CANADIAN JOURNAL OF MATHEMATICS

Journal Canadien de Mathématiques

VOL. XIII - NO. 2
1961


On convex fundamental regions for a lattice A. M. Macbeath 
Longest increasing and decreasing subsequences C. Schensted 


Sylow theory for a certain class of operator groups 
Christine W. Ayoub 


On the derivation algebras of Lie algebras Shigeaki Togo 


An enumeration problem related to the number of 
labelled bi-coloured graphs C. Y. Lee 


Programmes in paired spaces K. S. Kretschmer 
Widths and heights of (0,1)-matrices 
D. R. Fulkerson and H. J. Ryser 


Sublattices of a free lattice Bjarni Jonsson 
Distributive sublattices of a free lattice 
Fred Galvin and Bjarni Jonsson 


Integration of subspaces derived from a linear 
transformation field Edward T. Kobayashi 


The Lebesgue constants for regular Hausdorff methods 
Lee Lorch and Donald J. Newman 


Some problems for typically real functions James A. Jenkins 


Properties of the coefficients of orthonormal 
sequences P. S. Bullen 


On a class of singular differential operators R. R. D. Kemp 


Properties of solutions of parabolic equations and 
inequalities M. H. Protter 


Graph theory and probability. II P. Erdos 
Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 







EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery,
J.-M. Maranda, G. de B. Robinson, P. Scherk
with the co-operation of
D. B. DeLury, J. Dixmier, W. Fenchel, H. Freudenthal, I. Kaplansky,
N. S. Mendelsohn, C. A. Rogers, H. Schwerdtfeger, A. W. Tucker,
W. J. Webber, M. Wyman


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Authors are 
asked to write with a sense of perspective and as clearly as possible, 
especially in the introduction. Regarding typographical conventions, 
attention is drawn to the Author's Manual of which a copy will be 
furnished on request. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $10.00. This is reduced to $5.00 for individual members of recognized 
Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton University 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Mount Allison University Nova Scotia Technical College 
Queen’s University St. Mary’s University 
University of Saskatchewan University of Toronto 

National Research Council of Canada 

and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 











ON CONVEX FUNDAMENTAL REGIONS FOR A 
LATTICE 


A. M. MACBEATH 


Let $\Lambda$ be a lattice in Euclidean $n$-space, that is, $\Lambda$ is a set of points
$\xi_1 a_1 + \ldots + \xi_n a_n$ where $a_1, \ldots, a_n$ are linearly independent vectors
and the $\xi_i$ run over all integers. Let $\mu$ denote the Lebesgue measure. A closed
convex set $F$ is called a fundamental region for $\Lambda$ if the sets $F + x$
($x \in \Lambda$) cover the whole space without overlapping; that is, if $F^\circ$ is the
interior of $F$, and $0 \ne x \in \Lambda$, then $F^\circ \cap (F^\circ + x) = \emptyset$.

Let $f(x)$ be a positive definite quadratic form. The set $F$ consisting of all $x$
which satisfy the inequalities $f(x) \le f(x + a)$, $0 \ne a \in \Lambda$, is clearly a
fundamental region which (following Coxeter) we shall call the Dirichlet region
of $f$ and $\Lambda$. In his beautiful classical paper (4), Voronoi showed that if $F$ is a
primitive fundamental region (that is, if each of its vertices is a vertex of
exactly $n$ neighbours $F + a$, $a \in \Lambda$), then $F$ is the Dirichlet region associated
with some quadratic form. The classification of non-primitive $F$ has still not
been achieved and, in particular, there is an unsettled conjecture of Voronoi
that every $F$ is a limit of primitive ones.

It follows from Voronoi’s theorem that every primitive F possesses a centre 
of symmetry. In this note I prove the same result for non-primitive F. For a 
quite different proof, given in full only for three-space, see Minkowski (3). 

Let $F$ be a fundamental region which is closed and convex. A set $A$ is called
a $\Lambda$-packing if $A \cap (A + x) = \emptyset$ for $0 \ne x \in \Lambda$, and it is known (see, for
instance, 2) that then $\mu(A) \le \mu(F)$. The set $A$ is a packing if and only if the
equation $a_1 = a_2 + x$ has no solution with $a_1, a_2 \in A$, $0 \ne x \in \Lambda$, that is,
on rewriting the equation $x = a_1 - a_2$, if and only if $A - A$ contains no
lattice point except the origin.

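Now $F^\circ$ is a $\Lambda$-packing, and, since $F^\circ$ is convex,

$$F^\circ - F^\circ = (\tfrac12 F^\circ + \tfrac12 F^\circ) - (\tfrac12 F^\circ + \tfrac12 F^\circ)
= \tfrac12 (F^\circ - F^\circ) - \tfrac12 (F^\circ - F^\circ),$$

it follows that $\tfrac12 (F^\circ - F^\circ)$ is a packing, and therefore

$$\mu(\tfrac12 (F^\circ - F^\circ)) \le \mu(F) = \mu(F^\circ). \tag{1}$$

Now by the Brunn-Minkowski theorem (1, pp. 88-91), we have

$$\mu(\tfrac12 F^\circ + \tfrac12 (-F^\circ))^{1/n} \ge \mu(F^\circ)^{1/n}. \tag{2}$$

From (1), (2), equality must hold in the Brunn-Minkowski theorem, so that
$F^\circ$ is homothetic to $-F^\circ$, that is, $F^\circ$ has a centre of symmetry.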


Received January 30, 1960. 













REFERENCES 


1. T. Bonnesen and W. Fenchel, Theorie der konvexen Körper (Berlin: Springer, 1934).
2. A. M. Macbeath, Abstract theory of packings and coverings, Proc. Glasgow Math. Assoc.,
4 (1959), 92-95.
3. H. Minkowski, Allgemeine Lehrsätze ueber die konvexen Polyeder, Ges. Math. Abh., 2
(1911), 103-121.
4. G. Voronoi, Nouvelles applications des paramètres continus à la théorie des formes quadratiques,
II, J. reine angew. Math., 134 (1908), 198-287.




Queen's College 
Dundee 













LONGEST INCREASING AND DECREASING 
SUBSEQUENCES 


C. SCHENSTED 


This paper deals with finite sequences of integers. Typical of the problems
we shall treat is the determination of the number of sequences of length $n$,
consisting of the integers $1, 2, \ldots, m$, which have a longest increasing
subsequence of length $\alpha$. Throughout the first part of the paper we will deal
only with sequences in which no numbers are repeated. In the second part
we will extend the results to include the possibility of repetition. Our results
will be stated in terms of standard Young tableaux.


PART I


Definition. A standard Young tableau of order $n$ is an arrangement of $n$
distinct natural numbers in rows and columns so that the numbers in each
row and in each column form increasing sequences, and so that there is an
element of each row (column) in the first column (row) and there are no
gaps between numbers.


Example.

    2 4 7
    3 8        (order = 7)
    5 9


Definition. The shape of a standard tableau is an arrangement of squares 
with one square replacing each number in the standard tableau. 


Example. The shape of

    2 4 7
    3 8
    5 9

is as shown in Figure 1.

FIG. 1. (Rows of 3, 2, and 2 squares.)


Received June 23, 1959; in revised form August 29, 1960. This work was conducted by
Project MICHIGAN under Department of the Army Contract (DA-36-069-SC-78801),
administered by the U.S. Army Signal Corps.

The author would like to thank W. Richardson, G. Rabson, T. Curtz, I. Schensted, R.
Thrall, and J. Riordan for illuminating discussions concerning this problem, and E. Graves
for calculations which contributed to the solution. The problem originated as one aspect of
a paper on sorting theory by R. Bear and P. Brock, Natural sorting, The University of Michigan,
Willow Run Laboratories, Project MICHIGAN Report 2144-278-T, submitted for publication
in Soc. Ind. App. Math.




One reason that standard tableaux are so useful to us is that it is easy to
compute the number of standard tableaux of a given shape either by means
of a simple recurrence relation, or by means of the following elegant result
of Frame, Robinson, and Thrall (1).


THEOREM. The number of standard tableaux of a given shape containing the
integers $1, 2, \ldots, n$ is

$$\frac{n!}{\prod_{j=1}^{n} h_j}. \tag{1}$$

Here the $h_j$ are the hook lengths, that is, the number of elements counting
from the bottom of a column to a given element and then to the right end
of the row.


Example. To compute the number of standard tableaux of the shape shown
in Figure 2(a), we first find the hook lengths, which are shown in Figure
2(b).

    6 5 3 1
    4 3 1
    2 1

FIG. 2(a) (the shape, with rows of 4, 3, and 2 squares). FIG. 2(b) (its hook lengths, as above).

Then we find that the number of standard tableaux of this shape is

$$\frac{9!}{6 \cdot 5 \cdot 3 \cdot 1 \cdot 4 \cdot 3 \cdot 1 \cdot 2 \cdot 1} = 168.$$
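
The hook-length computation in this example can be sketched in a few lines of Python. This is only an illustration: the representation of a shape as a list of row lengths and the function names `hook_lengths` and `count_standard_tableaux` are assumptions made here, not notation from the paper.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every square of a shape given as a list of row lengths."""
    # Column c contains one square for every row that is longer than c.
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1   # arm + leg + the square itself
             for j in range(shape[i])]
            for i in range(len(shape))]

def count_standard_tableaux(shape):
    """Number of standard tableaux of the shape, following Expression (1)."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product

print(hook_lengths([4, 3, 2]))             # hook lengths of the shape of Figure 2(a)
print(count_standard_tableaux([4, 3, 2]))  # 168
```

The shape must be non-empty and given with non-increasing row lengths; no validation is attempted in this sketch.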
Definition. $S \leftarrow x$ is defined as the array obtained from the standard tableau,
$S$, by means of the following steps:


(i) Insert x in the first row of S either by displacing the smallest number 
which is larger than x, or if no number is larger than x, by adding x at the 
end of the first row. 


(ii) If x displaced a number from the first row, then insert this number 
in the second row either by displacing the smallest number which is larger 
than it or by adding it at the end of the second row. 


(iii) Repeat this process row by row until some number is added at the 
end of a row. 


In the above steps "adding at the end of the row" is interpreted as putting
in the first column in the given row if the row does not yet have any entries
in it. We define $x \to S$ similarly except that we replace the word "row" by
the word "column" throughout.













Example. If $S$ is the tableau

    2 4 7
    3 8
    5 9

then $S \leftarrow 6$ and $6 \to S$ are, respectively,

    2 4 6        2 4 7
    3 7          3 8
    5 8          5 9
    9            6
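
The two insertion operations can be sketched as follows. This is a minimal illustration, assuming a tableau is stored as a list of rows; the names `row_insert`, `column_insert`, and `transpose` are hypothetical helpers introduced here. Column insertion is implemented by transposing, applying the row rule, and transposing back, which is one way of reading "replace the word row by the word column".

```python
from bisect import bisect_right

def row_insert(tableau, x):
    """Return S <- x for a tableau S given as a list of rows (S is not modified)."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)   # index of the smallest entry larger than x
        if k == len(row):          # no larger entry: add x at the end of this row
            row.append(x)
            return rows
        row[k], x = x, row[k]      # displace that entry and carry it to the next row
    rows.append([x])               # the displaced number starts a new row
    return rows

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

def column_insert(x, tableau):
    """Return x -> S: the same rule with 'row' replaced by 'column'."""
    return transpose(row_insert(transpose(tableau), x))

S = [[2, 4, 7], [3, 8], [5, 9]]
print(row_insert(S, 6))      # [[2, 4, 6], [3, 7], [5, 8], [9]]
print(column_insert(6, S))   # [[2, 4, 7], [3, 8], [5, 9], [6]]
```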


LEMMA 1. $S \leftarrow x$ and $x \to S$ are standard tableaux.


Proof. Since the proofs for $S \leftarrow x$ and $x \to S$ are similar we consider only
$S \leftarrow x$.

First we note that if two consecutive rows of $S$ have the same length, and
if a number is displaced from the first of these two rows, then it will either
displace the number which was standing under it or else some number to
its left, and thus will not be added at the end of the row. Thus a row cannot
be made longer than the row above it and $S \leftarrow x$ cannot fail to be a standard
tableau on account of its shape. Thus we have only to prove that the numbers
in each row and column still form increasing sequences.

A number is inserted into a row in such a place that the number to its 
left (if any) is smaller, and the number to its right (if any) is larger. Thus 
the numbers in each row form increasing sequences. 

The number (if any) which ends up below a number which is inserted at 
a new position is either the number which it displaced, which is therefore 
larger, or else the number which previously stood below the number which 
it displaced, which is larger still. 

When a number is displaced from one row to the next it ends up either in 
the position directly beneath the one in which it originally stood, or else 
further to the left (since it is smaller than the number which previously stood 
underneath it). Thus it is either under the number which displaced it, which 
is therefore smaller, or else a number to the left of it, which is smaller still. 

The last two paragraphs show that two consecutive numbers in a column 
form an increasing sequence if either of them has just been inserted into its 
present position. If neither of them has just been inserted, then they are the 
numbers which were previously there in S and which therefore are in in- 
creasing order. Hence the columns also form increasing sequences and the 
proof of the lemma is completed. 


Definition. The P-symbol corresponding to a sequence of distinct integers
$x_1 x_2 \ldots x_n$ is the standard tableau $(\ldots((x_1 \leftarrow x_2) \leftarrow x_3) \ldots \leftarrow x_n)$. The
Q-symbol corresponding to the same sequence is the array which is obtained
by putting $k$ in the square which is added to the shape of the P-symbol when
$x_k$ is inserted in the P-symbol.
















Examples.

    Sequence   3    35    354    3549    35498    354982    3549827
    P-symbol   3    35    34     349     348      248       247
                          5      5       59       39        38
                                                  5         59
    Q-symbol   1    12    12     124     124      124       124
                          3      3       35       35        35
                                                  6         67
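
The construction of the two symbols can be sketched as a single pass over the sequence. The function name `symbols` and the list-of-rows representation are assumptions of this sketch; the insertion rule is the one defined above for $S \leftarrow x$.

```python
from bisect import bisect_right

def symbols(sequence):
    """Return the (P-symbol, Q-symbol) of a sequence as two lists of rows."""
    P, Q = [], []
    for k, x in enumerate(sequence, start=1):
        i = 0
        while True:
            if i == len(P):            # a new row is started by the carried number
                P.append([x])
                Q.append([k])
                break
            row = P[i]
            j = bisect_right(row, x)   # smallest entry of this row larger than x
            if j == len(row):          # none: x is added at the end of the row
                row.append(x)
                Q[i].append(k)         # record k in the square just added
                break
            row[j], x = x, row[j]      # displace, and continue in the next row
            i += 1
    return P, Q

P, Q = symbols([3, 5, 4, 9, 8, 2, 7])
print(P)   # [[2, 4, 7], [3, 8], [5, 9]]
print(Q)   # [[1, 2, 4], [3, 5], [6, 7]]
```

The output agrees with the last column of the table of examples.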


LEMMA 2. The Q-symbol corresponding to an arbitrary sequence is a standard 
tableau. 


Proof. Since the Q-symbol has the same shape as the P-symbol, and since
the P-symbol is a standard tableau, the shape of the Q-symbol is legitimate.
Each digit added to the Q-symbol is larger than all of the previous digits, and
in particular is larger than the digits above it and to its left. Hence the
numbers in each row and column form increasing sequences, and the lemma
is established.


LEMMA 3. There is a one-to-one correspondence between sequences made with
the $n$ distinct integers $x_1, x_2, \ldots, x_n$ and ordered pairs of standard tableaux of
the same shape, the first containing $x_1, x_2, \ldots, x_n$ and the second containing
$1, 2, \ldots, n$.


Proof. Given a sequence, the P-symbol and Q-symbol are uniquely deter- 
mined standard tableaux of the type mentioned in the lemma. Given a pair 
of standard tableaux of the appropriate types we can find the unique sequence 
which could have them for a P-symbol and Q-symbol as follows: The position 
of the largest number in the second tells us which number was added on to 
a row of the first without displacing another number when the last digit was 
inserted. This must have been displaced from the previous row by the largest 
number which is smaller than it (there always will be at least one number 
smaller than it in the preceding row since the one directly above it is smaller). 
This in turn must have been displaced from the next row up. Finally we get 
to the first row and discover what number was inserted into it. This is the 
last digit of the sequence. We now also know what the P-symbol and Q-symbol 
were before the last digit was inserted. Thus we can repeat the procedure to 
find the next to the last digit of the sequence. This proves the lemma. 
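
The reverse procedure described in this proof can be sketched as follows. The code is an illustration under the same assumptions as before (tableaux as lists of rows, distinct entries); `symbols` repeats the forward construction so that the block is self-contained, and `sequence_from_symbols` is a hypothetical name for the inverse.

```python
from bisect import bisect_left, bisect_right
from itertools import permutations

def symbols(sequence):
    """Forward construction: (P-symbol, Q-symbol) of a sequence."""
    P, Q = [], []
    for k, x in enumerate(sequence, start=1):
        i = 0
        while True:
            if i == len(P):
                P.append([x]); Q.append([k]); break
            j = bisect_right(P[i], x)
            if j == len(P[i]):
                P[i].append(x); Q[i].append(k); break
            P[i][j], x = x, P[i][j]
            i += 1
    return P, Q

def sequence_from_symbols(P, Q):
    """Recover the sequence from a pair (P, Q), undoing one insertion at a time."""
    P = [list(r) for r in P]
    Q = [list(r) for r in Q]
    out = []
    for k in range(sum(len(r) for r in Q), 0, -1):
        # The largest number k of Q marks the square added by the last insertion.
        i = next(r for r in range(len(Q)) if Q[r] and Q[r][-1] == k)
        Q[i].pop()
        x = P[i].pop()
        if not P[i]:                       # the row became empty: drop it
            del P[i], Q[i]
        for r in range(i - 1, -1, -1):
            j = bisect_left(P[r], x) - 1   # largest entry of the row smaller than x
            P[r][j], x = x, P[r][j]        # x was displaced by that entry
        out.append(x)                      # the number inserted into the first row
    return out[::-1]

P, Q = symbols([3, 5, 4, 9, 8, 2, 7])
print(sequence_from_symbols(P, Q))   # [3, 5, 4, 9, 8, 2, 7]
print(all(sequence_from_symbols(*symbols(p)) == list(p)
          for p in permutations(range(1, 6))))
```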


Note. Since there are $n!$ possible sequences of $x_1, x_2, \ldots, x_n$, Lemma 3
shows that there are $n!$ ordered pairs of standard tableaux of order $n$ such
that the shapes of tableaux in each pair are the same, but the shapes of
tableaux in different pairs are not necessarily the same. This fact is already
known (2). Of course, the number of ordered pairs of standard tableaux of a
given shape is equal to the square of the number of standard tableaux of
that shape, which is given in turn by Expression (1).










Definition. The jth basic subsequence of a given sequence consists of the 
digits which are inserted into the jth place in the first row of the P-symbol. 


LEMMA 4. Each basic subsequence is a decreasing subsequence.


Proof. Each number in the jth basic subsequence, on insertion in the first 
row displaces the previous member of the jth basic subsequence, which must 
therefore be larger than the present member. 


LEMMA 5. Given any member of the jth basic subsequence, we can find a member 
of the (j — 1)st basic subsequence which is smaller and which occurs further to 
the left in the given sequence. 


Proof. The number in the (j — 1)st place in the first row, when the given 
member of the jth basic subsequence is inserted, is such a member of the 
(j — 1)st basic subsequence. 


THEOREM 1. The number of columns in the P-symbol (or the Q-symbol) is 
equal to the length of the longest increasing subsequence of the corresponding 
sequence. 


Proof. The number of columns is the same as the number of basic subse- 
quences. By Lemma 4 there can be at most one member of each basic sub- 
sequence in any increasing subsequence. By Lemma 5 we can construct an 
increasing subsequence with one element from each basic subsequence, 
Q.E.D. 


Note. The proof shows us how to actually obtain an increasing subsequence
of maximal length.
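
A minimal sketch of the construction in this note follows, assuming a sequence of distinct integers as in Part I. Only the first row of the P-symbol is tracked, since the lower rows do not influence it; for each inserted number the sketch records the number standing in the preceding place of the first row at that moment, which is the member of the previous basic subsequence supplied by Lemma 5. The function name and the dictionary `pred` are assumptions of this sketch.

```python
from bisect import bisect_right

def longest_increasing_subsequence(sequence):
    """Return one increasing subsequence of maximal length (distinct entries assumed)."""
    first_row = []   # current first row of the P-symbol
    pred = {}        # pred[x]: smaller, earlier member of the previous basic subsequence
    for x in sequence:
        j = bisect_right(first_row, x)            # x goes into the (j+1)th place
        pred[x] = first_row[j - 1] if j > 0 else None
        if j == len(first_row):
            first_row.append(x)
        else:
            first_row[j] = x                      # x displaces the previous member
    result = []
    x = first_row[-1] if first_row else None
    while x is not None:                          # walk back, one basic subsequence at a time
        result.append(x)
        x = pred[x]
    return result[::-1]

print(longest_increasing_subsequence([3, 5, 4, 9, 8, 2, 7]))   # [3, 4, 7]
```

The length of the result equals the number of columns of the P-symbol, in agreement with Theorem 1.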
LEMMA 6. $(x \to S) \leftarrow y = x \to (S \leftarrow y)$.


Proof. Suppose first, that of all the digits in $x$, $y$, and $S$, the largest is $y$.
We represent $S$ schematically by Figure 3. There are two cases of interest.

FIG. 3.

The square added to the shape of $S$ in $x \to S$ is in the first row, or it is not. We
represent $x \to S$ schematically in these two cases by Figure 4(a) and 4(b)
respectively, where $x'$ is the number added to the end of some column
without displacing another number when we form $x \to S$. It is easily verified














FIG. 4(a). FIG. 4(b).


that in the first case the final result is as shown in Figure 5(a) and in the
second case the result is that of Figure 5(b).

FIG. 5(a). $(x \to S) \leftarrow y = x \to (S \leftarrow y)$ in the first case.

FIG. 5(b). $(x \to S) \leftarrow y = x \to (S \leftarrow y)$ in the second case.


This proves the lemma if y is the largest number involved, and the proof 
is similar if x is the largest number involved. 

Suppose now that, of all the digits in $x$, $y$, and $S$, the largest is $N$, and
that $N$ is in $S$. In this case we use induction. The lemma can be easily verified
by direct calculation if $S$ is of order 0, 1, or 2. We assume the lemma true
for $S$ of order $n$, and prove that it is then true for $S$ of order $n + 1$.

Let us suppose, then, that $S$ is of order $n + 1$. Now, since $N$ is the largest
number in $S$, we see that $N$ is at the end of whatever row it is in, and also
at the end of its column. Thus, if we remove $N$ from $S$ we will obtain a new
standard tableau, $S'$, of order $n$. Now since $N$ is larger than any of the other
numbers, it can never displace any of them, and hence the presence or absence
of $N$ cannot have any influence on the position of the other numbers. Thus
$(x \to S) \leftarrow y$ will be the same as $(x \to S') \leftarrow y$ except that $N$ is added
somewhere, and $x \to (S \leftarrow y)$ will be the same as $x \to (S' \leftarrow y)$ except for the
addition of $N$. However, since $S'$ is of order $n$, we have by assumption

$$(x \to S') \leftarrow y = x \to (S' \leftarrow y).$$




Thus we have only to prove that $N$ occupies the same position in $(x \to S) \leftarrow y$
and $x \to (S \leftarrow y)$ to prove the lemma. The truth of this can be easily verified
for each of the possible cases which can arise as to the relative locations of
$N$, $x'$, and $y'$. Here $x'$ ($y'$) is the number which is added to some column (row)
without displacing another number when we form $x \to S'$ ($S' \leftarrow y$). In making
these verifications it is necessary to keep the following facts in mind.

If $x'$ and $y'$ do not fall into the same square, then we represent $S'$, $x \to S'$,
and $S' \leftarrow y$ schematically by Figure 6(a), 6(b), and 6(c) respectively.
The shape of $(x \to S') \leftarrow y$ must have a square added to the shape of








































FIG. 6(a). FIG. 6(b). FIG. 6(c).


$x \to S'$, and the shape of $x \to (S' \leftarrow y)$ must have a square added to the
shape of $S' \leftarrow y$. By assumption $(x \to S') \leftarrow y = x \to (S' \leftarrow y)$ so that the
shape of $(x \to S') \leftarrow y$ and $x \to (S' \leftarrow y)$ must be Figure 7.











FIG. 7.











If $x'$ (in $x \to S'$) and $y'$ (in $S' \leftarrow y$) occupy the same position then we
schematically represent $S'$, $x \to S'$, and $S' \leftarrow y$ by Figure 8(a), 8(b), and
8(c) respectively. Here the shaded parts of $x \to S'$ and $S' \leftarrow y$ are the



































FIG. 8(a). FIG. 8(b). FIG. 8(c).














regions where numbers could have been displaced. Now let us suppose
that $y' > x'$. Then when we insert $y$ into $x \to S'$ the same numbers will be
displaced in each row as were displaced when we inserted $y$ into $S'$, until
we displace $y'$.













In $S' \leftarrow y$ we would have put $y'$ where $x'$ is, but $y' > x'$, thus $y'$ will be
added at the end of the row containing $x'$, and the shape of $(x \to S') \leftarrow y$
(and hence of $x \to (S' \leftarrow y)$) will be Figure 9. If we had had $x' > y'$, then











FIG. 9.


the shape of $(x \to S') \leftarrow y$ and $x \to (S' \leftarrow y)$ would have been Figure 10.
Thus, if we know the shapes of $x \to S'$ and $S' \leftarrow y$, and if we know whether
$x' > y'$ or $x' < y'$, then we know the shape of $(x \to S') \leftarrow y$ and $x \to (S' \leftarrow y)$.











FIG. 10.











Now we can return to the problem of showing that $N$ has the same position
in $(x \to S) \leftarrow y$ and $x \to (S \leftarrow y)$. As we mentioned there are several special
cases. We will consider only three of these as the others go in the same way.
First suppose that the position of $N$ in $S$ does not coincide with either the
position of $x'$ in $x \to S'$ or the position of $y'$ in $S' \leftarrow y$. Then $N$ will never
be displaced and it will have the same position in $(x \to S) \leftarrow y$ and $x \to (S \leftarrow y)$
as it does in $S$.

Next suppose that the position of $N$ in $S$ coincides with the position of $x'$
in $x \to S'$, and that the position of $y'$ in $S' \leftarrow y$ lies to the left of this. Then
we have schematically Figure 11.

Finally suppose that the position of $N$ in $S$ coincides with the position
of $x'$ in $x \to S'$, and that the position of $y'$ in $S' \leftarrow y$ lies one column to the
right of this. Then schematically we have Figure 12. Proceeding similarly we
can verify all of the other special cases, and hence the validity of Lemma 6.
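
Lemma 6 can also be checked numerically on small cases. The following sketch is not a substitute for the proof; it assumes tableaux stored as lists of rows, and the helper names (`row_insert`, `column_insert`, `transpose`, `lemma6_holds`) are introduced here for illustration only.

```python
from bisect import bisect_right
from itertools import permutations

def row_insert(tableau, x):
    """Return S <- x (S is not modified)."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]
    rows.append([x])
    return rows

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

def column_insert(x, tableau):
    """Return x -> S."""
    return transpose(row_insert(transpose(tableau), x))

def lemma6_holds(S, x, y):
    """Compare (x -> S) <- y with x -> (S <- y)."""
    return row_insert(column_insert(x, S), y) == column_insert(x, row_insert(S, y))

S = [[2, 4, 7], [3, 8], [5, 9]]
print(lemma6_holds(S, 6, 1))

# Every tableau S built from four of the numbers 1..6, with x and y the other two.
ok = True
for p in permutations(range(1, 7)):
    T = []
    for v in p[:4]:
        T = row_insert(T, v)
    ok = ok and lemma6_holds(T, p[4], p[5])
print(ok)
```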


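FIG. 11. (Schematic tableaux for $x \to S'$, $S' \leftarrow y$, $x \to S$, $S \leftarrow y$, and
$(x \to S) \leftarrow y = x \to (S \leftarrow y)$, showing the positions of $x'$ and $N$.)

FIG. 12. (Schematic tableaux for $S$, $x \to S'$, $S' \leftarrow y$, $x \to S$, and
$(x \to S) \leftarrow y = x \to (S \leftarrow y)$, showing the positions of $x'$, $y'$, and $N$.)


LEMMA 7. If one sequence is a second sequence written backwards, then the
P-symbol of the first is obtained from the P-symbol of the second by interchanging
rows and columns.


Proof. First we note that $x \to y = x \leftarrow y$, since if $x < y$ they are both the
one-row tableau $x\ y$, and if $x > y$ they are both the one-column tableau with $y$
above $x$. Now we define

$$P(x_1, x_2, \ldots, x_n) = (\ldots((x_1 \leftarrow x_2) \leftarrow x_3) \ldots \leftarrow x_n)$$

and

$$\bar{P}(x_1, x_2, \ldots, x_n) = (x_1 \to \ldots (x_{n-2} \to (x_{n-1} \to x_n)) \ldots).$$

Next we assume that $P(x_1, x_2, \ldots, x_{n-1}) = \bar{P}(x_1, x_2, \ldots, x_{n-1})$ and that
$P(x_1, x_2, \ldots, x_n) = \bar{P}(x_1, x_2, \ldots, x_n)$ and prove that
$P(x_1, x_2, \ldots, x_n, x_{n+1}) = \bar{P}(x_1, x_2, \ldots, x_n, x_{n+1})$. (We have just shown that
$P(x_1, x_2) = x_1 \leftarrow x_2 = x_1 \to x_2 = \bar{P}(x_1, x_2)$; furthermore
$P(x_1) = x_1 = \bar{P}(x_1)$.) We have

$$\begin{aligned}
P(x_1, x_2, \ldots, x_n, x_{n+1}) &= P(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= \bar{P}(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= [x_1 \to \bar{P}(x_2, \ldots, x_n)] \leftarrow x_{n+1} \\
&= x_1 \to [\bar{P}(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \to [P(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \to P(x_2, \ldots, x_n, x_{n+1}) \\
&= x_1 \to \bar{P}(x_2, \ldots, x_n, x_{n+1}) \\
&= \bar{P}(x_1, x_2, \ldots, x_n, x_{n+1}).
\end{aligned}$$

Of these lines, the second, fifth, and seventh follow by assumption, and the
fourth from Lemma 6. Now $P(x_1, \ldots, x_n)$ is the P-symbol for the sequence
$x_1, x_2, \ldots, x_n$, while $\bar{P}(x_1, x_2, \ldots, x_n)$ is the P-symbol for the sequence
$x_n, \ldots, x_2, x_1$ with rows and columns interchanged. Hence the lemma follows.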


Note. It must not be assumed that Lemma 7 holds for Q-symbols. 
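
The statement of Lemma 7 can be illustrated on the running example. The sketch below assumes distinct entries and tableaux stored as lists of rows; `p_symbol` and `transpose` are hypothetical helper names, and the loop at the end is a small exhaustive check, not a proof.

```python
from bisect import bisect_right
from itertools import permutations

def p_symbol(sequence):
    """P-symbol of a sequence, built by repeated row insertion."""
    P = []
    for x in sequence:
        for row in P:
            j = bisect_right(row, x)
            if j == len(row):
                row.append(x)
                break
            row[j], x = x, row[j]
        else:                       # x was displaced out of every row (or P is empty)
            P.append([x])
    return P

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

seq = [3, 5, 4, 9, 8, 2, 7]
print(p_symbol(seq))         # [[2, 4, 7], [3, 8], [5, 9]]
print(p_symbol(seq[::-1]))   # [[2, 3, 5], [4, 8, 9], [7]]
print(all(p_symbol(p[::-1]) == transpose(p_symbol(p))
          for p in permutations(range(1, 7))))
```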


THEOREM 2. The number of rows in the P-symbol (or the Q-symbol) is equal 
to the length of the longest decreasing subsequence of the corresponding sequence. 


Proof. This follows immediately from Theorem 1 and Lemma 7, since 
writing a sequence backwards changes increasing subsequences into decreasing 
subsequences. 

THEOREM 3. The number of sequences consisting of the distinct numbers
$x_1, x_2, \ldots, x_n$, and having a longest increasing subsequence of length $\alpha$ and a
longest decreasing subsequence of length $\beta$, is the sum of the squares of the numbers
of standard tableaux with shapes having $\alpha$ columns and $\beta$ rows.







Proof. Follows immediately from Lemma 3 and Theorems 1 and 2 (see 
also the note to Lemma 3). 


Example. To find the number of permutations of 1, 2,3,...,25 having a 
longest decreasing subsequence of length three and a longest increasing sub- 
sequence of length 21 we note that the only allowed shapes with 25 squares, 
21 columns, and 3 rows are those of Figure 13. 


FIG. 13. (The two shapes: rows of 21, 2, and 2 squares, and rows of 21, 3, and 1 squares.)


By the Frame-Robinson-Thrall theorem, the corresponding numbers of
standard tableaux are 21,000 and 31,350 respectively. Thus the desired
number of permutations is

$$21{,}000^2 + 31{,}350^2 = 1{,}423{,}822{,}500.$$
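
The count in this example can be reproduced mechanically from Theorem 3 and Expression (1). The sketch below enumerates the shapes with the required numbers of columns and rows and sums the squares of their tableau counts; the function names and the representation of a shape as a list of row lengths are assumptions of this illustration.

```python
from math import factorial

def partitions(n, largest):
    """Yield the shapes with n squares (non-increasing row lengths, each <= largest)."""
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

def count_standard_tableaux(shape):
    """Number of standard tableaux of the shape, by the hook-length formula."""
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    product = 1
    for i, r in enumerate(shape):
        for j in range(r):
            product *= (r - j - 1) + (cols[j] - i - 1) + 1
    return factorial(sum(shape)) // product

def count_sequences(n, alpha, beta):
    """Theorem 3: sum of squares over shapes with alpha columns and beta rows."""
    return sum(count_standard_tableaux(s) ** 2
               for s in partitions(n, alpha)
               if s[0] == alpha and len(s) == beta)

print([s for s in partitions(25, 21) if s[0] == 21 and len(s) == 3])
print(count_standard_tableaux([21, 2, 2]), count_standard_tableaux([21, 3, 1]))
print(count_sequences(25, 21, 3))   # 1423822500
```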


Part II 


We now want to consider sequences in which some of the numbers are
repeated. We can obtain the properties of such sequences in terms of sequences
without repetitions by a simple artifice. Suppose the smallest number appears
$p$ times in the sequence, the next smallest $q$ times, etc. We replace the $p$
occurrences of the smallest number by the numbers $1, 2, \ldots, p$ (in this
order), the $q$ occurrences of the next number by $p + 1, p + 2, \ldots, p + q$,
etc. Then the decreasing subsequences of the two sequences will be in
one-to-one correspondence, while the increasing subsequences of the new sequence
will be in one-to-one correspondence with the non-decreasing subsequences
of the original sequence.


Example. Given the sequence 3 3 2 3 4 1, we replace 1 by 1, 2 by 2, the
three 3's by 4, 5, 6, and 4 by 7. The result is 4 5 2 6 7 1. The latter sequence
has a decreasing subsequence 5 2 1 which corresponds to a decreasing
subsequence 3 2 1 in the original and an increasing subsequence 4 5 6 7 which
corresponds to a non-decreasing subsequence 3 3 3 4 in the original.
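
The replacement rule of Part II can be sketched as a stable ranking of the positions of the sequence. The function name `standardize` is an assumption of this sketch. Applied literally, the rule uses the consecutive labels $1, 2, \ldots, n$, so for the sequence of the example it yields 3 4 2 5 6 1; this has the same relative order as the 4 5 2 6 7 1 of the example (which skips the label 3), and only the relative order matters for increasing and decreasing subsequences.

```python
def standardize(sequence):
    """Replace repeated numbers by distinct ones, equal values numbered left to right."""
    # sorted() is stable, so equal values keep their left-to-right order.
    order = sorted(range(len(sequence)), key=lambda i: sequence[i])
    result = [0] * len(sequence)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

print(standardize([3, 3, 2, 3, 4, 1]))   # [3, 4, 2, 5, 6, 1]
```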

If we construct the P-symbol for the derived sequence, and map the 
numbers in it back to the numbers in the original sequence, then we get a 
modified standard tableau in which repeated numbers are allowed, the numbers 
in each column form an increasing sequence, and the numbers in each row 
form a non-decreasing sequence. Since the numbers in the Q-symbol refer to 













the order of addition of spaces to the P-symbol, the Q-symbols of the two 
sequences will be identical. 

We can get modified forms of each of the results in Part I. The main result, 
Theorem 3, now takes the form: 


THEOREM 4. The number of sequences of $x_1, x_2, \ldots, x_n$ having a longest
non-decreasing sequence of length $\alpha$ and a longest decreasing sequence of length $\beta$ is
the sum of the products of the number of modified standard tableaux of a given
shape with the number of standard tableaux of the same shape, the shapes each
having $\alpha$ columns and $\beta$ rows.


Example. To find the number of sequences of seven numbers consisting 
entirely of 1's, 2’s, and 3’s having a longest non-decreasing sequence of length 
four and a longest decreasing sequence of length three, we proceed as follows. 
The possible tableaux must have the shape of Figure 14. 


FIG. 14. (Rows of 4, 2, and 1 squares.)


The possible modified standard tableaux are

    1111   1111   1112   1112   1113
    22     23     22     23     22
    3      3      3      3      3

    1113   1122   1122   1123   1123
    23     22     23     22     23
    3      3      3      3      3

    1133   1133   1222   1223   1233
    22     23     23     23     23
    3      3      3      3      3

They are 15 in number.


By the Frame-Robinson-Thrall theorem the number of standard tableaux
of this shape is 35. Hence the number of sequences of the desired type is
$15 \times 35 = 525$.
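
The three numbers in this example can be checked by brute force. The sketch below counts the modified standard tableaux of the shape with rows of 4, 2, and 1 squares by trying every filling with 1's, 2's, and 3's, counts the standard tableaux by the hook-length formula, and then counts the sequences directly. All function names are assumptions of this illustration, and exhaustive enumeration is practical only for small cases such as this one.

```python
from itertools import product
from math import factorial

def count_standard_tableaux(shape):
    """Number of standard tableaux of the shape, by the hook-length formula."""
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    prod = 1
    for i, r in enumerate(shape):
        for j in range(r):
            prod *= (r - j - 1) + (cols[j] - i - 1) + 1
    return factorial(sum(shape)) // prod

def count_modified_tableaux(shape, values):
    """Fillings with non-decreasing rows and increasing columns (brute force)."""
    cells = [(i, j) for i, r in enumerate(shape) for j in range(r)]
    count = 0
    for filling in product(values, repeat=len(cells)):
        t = dict(zip(cells, filling))
        if all((j == 0 or t[i, j - 1] <= t[i, j]) and
               (i == 0 or t[i - 1, j] < t[i, j]) for i, j in cells):
            count += 1
    return count

def longest(seq, follows):
    """Length of the longest subsequence in which follows(a, b) holds for neighbours."""
    best = []
    for i, x in enumerate(seq):
        best.append(1 + max([best[j] for j in range(i) if follows(seq[j], x)], default=0))
    return max(best, default=0)

modified = count_modified_tableaux([4, 2, 1], (1, 2, 3))
standard = count_standard_tableaux([4, 2, 1])
direct = sum(1 for w in product((1, 2, 3), repeat=7)
             if longest(w, lambda a, b: a <= b) == 4
             and longest(w, lambda a, b: a > b) == 3)
print(modified, standard, modified * standard, direct)   # 15 35 525 525
```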

As a further example we will work out explicit formulae for binary sequences 
(sequences consisting of 0’s and 1's). In this case the modified standard 
tableaux have the general form of Figure 15, where the bracketed region can 
have any division of 0’s and 1’s (the 0’s preceding the 1's, of course). 





FIG. 15. (First row: 0's followed by the bracketed region; second row: 1's.)


Let $n$ be the number of digits in the sequence. Let $m$ be the length of the
longest non-decreasing subsequence. Then there are no sequences for which
$m < n/2$. If $m = n$ the longest decreasing subsequence is of length 1. If
$n/2 \le m < n$, the longest decreasing subsequence is of length 2.

The number of possible modified tableaux is $2m - n + 1$. The number of
standard tableaux is

$$\frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}.$$

Thus the number of binary sequences of $n$ digits with a longest non-decreasing
subsequence of length $m$ is

$$\frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}.$$


Note. Since the total number of binary sequences is $2^n$ we have

$$2^n = \sum_{m \ge n/2} \frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}.$$
In the above derivation we allowed all possible binary sequences. Theorem
4 also readily solves the problem if the number of 0's and 1's in the sequence
is fixed. In this case there is at most one modified tableau and thus the number
of sequences of $n$ digits with a longest non-decreasing subsequence of length
$m$ is

$$\frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}$$

with the additional restriction that the number, $p$, of 0's in the sequence
must satisfy $n - m \le p \le m$.


Note. This shows that

$$\binom{n}{p} = \sum_{m:\ n - m \le p \le m} \frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}.$$
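
The formulae for binary sequences, and the two identities in the notes, can be compared with a direct enumeration for small $n$. The following is a sketch only: the function names are assumptions, the longest non-decreasing subsequence of a 0/1 sequence is found by trying every split point (0's before it, 1's after it), and the check covers $n \le 8$ rather than proving anything.

```python
from itertools import product
from math import comb, factorial

def longest_nondecreasing(bits):
    """Longest non-decreasing subsequence of a 0/1 sequence: 0's, then 1's."""
    return max(bits[:cut].count(0) + bits[cut:].count(1)
               for cut in range(len(bits) + 1))

def formula_all(n, m):
    """Binary sequences of n digits with longest non-decreasing subsequence m."""
    return factorial(n) * (2 * m - n + 1) ** 2 // (factorial(m + 1) * factorial(n - m))

def formula_fixed(n, m):
    """The same count when the number p of 0's is fixed, with n - m <= p <= m."""
    return factorial(n) * (2 * m - n + 1) // (factorial(m + 1) * factorial(n - m))

def check(n):
    """Compare both formulae and both identities with brute force for one n."""
    counts = {}                                   # (m, p) -> number of sequences
    for bits in product((0, 1), repeat=n):
        key = (longest_nondecreasing(bits), bits.count(0))
        counts[key] = counts.get(key, 0) + 1
    ms = [m for m in range(n + 1) if 2 * m >= n]
    ok = all(sum(c for (mm, _), c in counts.items() if mm == m) == formula_all(n, m)
             for m in ms)
    ok = ok and sum(formula_all(n, m) for m in ms) == 2 ** n
    for p in range(n + 1):
        allowed = [m for m in ms if n - m <= p <= m]
        ok = ok and all(counts.get((m, p), 0) == formula_fixed(n, m) for m in allowed)
        ok = ok and sum(formula_fixed(n, m) for m in allowed) == comb(n, p)
    return ok

print(all(check(n) for n in range(1, 9)))
```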


Throughout Part II we could have dealt equally well with increasing and 
non-increasing subsequences rather than decreasing and non-decreasing 
subsequences. 





REFERENCES 


1. J. S. Frame, G. de B. Robinson, and R. M. Thrall, The hook graphs of the symmetric group, 
Can. J. Math., 6 (1954), 316. 
2. D. E. Rutherford, Substitutional analysis (Edinburgh University Press, 1948), p. 26.


Institute for Defence Analysis 
Princeton 











SYLOW THEORY FOR A CERTAIN CLASS OF 
OPERATOR GROUPS 


CHRISTINE W. AYOUB 


1. Introduction. In this paper we consider again the group-theoretic
configuration studied in (1) and (2). Let $G$ be an additive group (not
necessarily abelian), let $M$ be a system of operators for $G$, and let $\phi$ be a family of
admissible subgroups which form a complete lattice relative to intersection
and compositum. Under these circumstances we call $G$ an $M$-$\phi$ group. In
(1) we studied the normal chains for an $M$-$\phi$ group and the relation between
certain normal chains. In (2) we considered the possibility of representing
an $M$-$\phi$ group as the direct sum of certain of its subgroups, and proved that
with suitable restrictions on the $M$-$\phi$ group the analogue of the following
theorem for finite groups holds: A group is the direct product of its Sylow
subgroups if and only if it is nilpotent. Here we show that under suitable
hypotheses (hypotheses (I), (II), and (III) stated at the beginning of § 3)
it is possible to generalize to $M$-$\phi$ groups many of the Sylow theorems of
classical group theory. The most important of these is the existence theorem,
Theorem 3.1.


2. Definitions and preliminary results. In order to make this paper as
self-contained as possible we shall summarize in this paragraph the definitions
and results which we shall use from the two previous papers (1) and (2).

Let $G$ be an $M$-$\phi$ group. The subgroups belonging to the lattice $\phi$ are
called $\phi$ subgroups. The following notions are defined in the obvious manner:
$M$-$\phi$ isomorphism, $M$-$\phi$ automorphism, $M$-$\phi$ homomorphism, the
$M$-$\phi$ quotient group $G/N$ (where $N$ is a normal $\phi$ subgroup of $G$). The
analogues of the Homomorphism Theorem and the Isomorphism Theorems
hold (see (1) for a statement of these definitions and theorems).

Throughout this paper we shall assume that $G$ possesses a $\phi$ composition
series all of whose factors are abelian, that is, there is a chain of $\phi$ subgroups

$$0 = A_0 \subset A_1 \subset \cdots \subset A_i \subset A_{i+1} \subset \cdots \subset A_n = G, \tag{1}$$

where $A_i$ is normal in $A_{i+1}$ for $i = 0, \ldots, n - 1$, such that each factor
$A_{i+1}/A_i$ is abelian and $\phi$ simple, that is, has no proper normal $\phi$ subgroups
($\ne 0$). We call (1) a $\phi$ composition series of length $n$ and we say that $G$ is $\phi$
soluble. The analogue of the Jordan-Hölder Theorem tells us that any two $\phi$
composition series have the same length and $M$-$\phi$ isomorphic factors. If


Received September 2, 1959. Much of the work of this paper was done when the author was 
the holder of the ZAE post-doctoral fellowship. 







the chain (1) consists of normal φ subgroups of G and if $A_{i+1}/A_i$ contains no
proper normal φ subgroups of $G/A_i$ ($\neq 0$) we call (1) a principal φ series.

If a is an element of G, the intersection of all φ subgroups which contain a
is called the φ cyclic subgroup generated by a. The M-φ group is said to be
φ nilpotent if the upper central φ chain joins 0 and G (for a definition of φ
centre and central φ chain see (2), Definitions 5.1 and 5.2). The M-φ group
P is said to be primary with characteristic F if it possesses a φ composition
series all of whose factors are M-φ isomorphic to F. We shall make extensive
use of the following theorem, which is proved in (2) (Theorem 7.2 under
hypotheses (i), (ii′), and (iii); see the Remark after Corollary 7.1):

(A) Let P be a primary M-φ group with abelian characteristic and assume
that

(i) Inner automorphisms are M-φ automorphisms.

(ii) The φ cyclic φ subgroups of P are abelian.

(iii) Any φ subgroup of P has a finite number of conjugates.
Then P is φ nilpotent.

We also need (Theorem 7.4 of (2)): 

(A′) Let G be an M-φ group which possesses a φ composition series.
Assume (i) of (A), and also that unitoral φ cyclic φ subgroups are primary.
Then if G is φ nilpotent, G is the direct sum of primary φ subgroups. (An
M-φ group is unitoral if it possesses a unique maximal normal φ subgroup.)

We shall also make use of the following result; the proof is an easy generalization
of the argument used for ordinary groups (see, for example, (3)):

(B) Let G be an M-φ group which possesses a φ composition series; and
let N be a minimal normal φ subgroup of G. Then N is the direct sum of a
finite number of M-φ isomorphic φ simple φ subgroups.

The φ subgroup S of G is said to be a φ link if there is a normal φ chain
connecting S and G, that is, if there exist φ subgroups $S_i$ such that

(2) $S = S_0 \subset \dots \subset S_i \subset S_{i+1} \subset \dots \subset S_k = G$, where $S_i$ is normal in $S_{i+1}$.


It is easy to see that if G possesses a φ composition series, the φ links satisfy
the double chain condition. We shall need the following result concerning φ
links (see Theorem 5.2 of (2)):

(C) If the M-φ group G is φ nilpotent, then any φ subgroup of G is a
φ link.

The following notations will be used: If A and B are subgroups of the
group G, {A, B} denotes the compositum of A and B. If S is a subgroup of
the group G and g an element of G, S(g) denotes the conjugate subgroup
$-g + S + g$. The notation $Z_\varphi(G)$ is used for the φ centre of G. The symbol
$\underset{(M-\varphi)}{\cong}$ is used for M-φ isomorphism.


3. The existence theorem. Throughout this paper we assume that G is a
φ soluble M-φ group which satisfies the following hypotheses:
















(I) Inner automorphisms are M-φ automorphisms.
(II) φ cyclic φ subgroups of G are abelian, and unitoral φ cyclic φ subgroups
are primary.
(III) Any φ subgroup has a finite number of conjugates.


Definition 3.1. Let G have a φ composition series which has F as a φ composition
factor of multiplicity (exactly) m. Then if S is a primary φ subgroup
of G with characteristic F and φ composition length m, S is called an F Sylow
subgroup of G.


Definition 3.2. If K is a φ subgroup of G, the φ normalizer of K in G, $N_\varphi(K)$,
is the maximal φ subgroup of G in which K is normal.


THEOREM 3.1.¹ For each φ composition factor F of G there exists an F Sylow
subgroup S of G. Furthermore, the F Sylow subgroups of G are all conjugate;
and if there is more than one, F is of finite order f, and the number of F Sylow
subgroups is congruent to 1 modulo f.
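
As a sanity check, the theorem can be read in the classical setting. The block below is a minimal worked instance, assuming (this specialization is not spelled out in the text) that M is empty and φ is the lattice of all subgroups of a finite soluble group, so that for F cyclic of prime order p the F Sylow subgroups are the ordinary Sylow p-subgroups. The group $S_3$ and the multiplicative cycle notation are illustrative choices.

```latex
% Assumption: M empty, \varphi = all subgroups, G = S_3 (order 6, soluble).
% Composition series 1 \subset A_3 \subset S_3 with factors Z_3 and Z_2,
% each of multiplicity 1.
\begin{align*}
F = \mathbb{Z}_2 &: \quad \text{Sylow subgroups }
   \langle (1\,2)\rangle,\ \langle (1\,3)\rangle,\ \langle (2\,3)\rangle,
   \qquad 3 \equiv 1 \pmod{2},\\
F = \mathbb{Z}_3 &: \quad \text{Sylow subgroup }
   A_3 = \langle (1\,2\,3)\rangle \ \text{(unique, hence normal)},
   \qquad 1 \equiv 1 \pmod{3}.
\end{align*}
```

In both cases the Sylow subgroups are conjugate and their number is congruent to 1 modulo the order of F, in line with the statement.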


Proof. We use induction on j(G), the φ composition length of G. If j(G) = 1,
the theorem is obviously true. Assume the theorem true for all M-φ groups
H with j(H) < j(G); and assume that the φ simple abelian group F is a φ
composition factor of multiplicity m for G.

We consider first the case where G contains a normal primary φ subgroup N of
characteristic F′ not M-φ isomorphic to F and where G/N is primary of
characteristic F. Let H be a maximal normal φ subgroup of G which contains
N; hence

$$G/H \underset{(M-\varphi)}{\cong} F.$$

Let s be an element of G not in H. Then if S is the φ cyclic φ subgroup of G
generated by s,

$$S/(S \cap H) \underset{(M-\varphi)}{\cong} (S + H)/H = G/H \underset{(M-\varphi)}{\cong} F,$$

so that S has F as a φ composition factor. Since S is φ cyclic, it is abelian and
hence by (A′), $S = S_1 + S_2$, where char($S_1$) = F and char($S_2$) = F′. Now
$S_2 \subset N \subset H$, and hence $S_1$ is not contained in H since S is not contained in
H.

If $S_1$ is an F Sylow subgroup of G, then the existence of an F Sylow subgroup
for G is proved. Otherwise $N + S_1$ is a φ link and hence is contained in
a maximal normal φ subgroup of G, say L.

Since H and L are proper φ subgroups of G, we know from the induction
assumption that H contains an F Sylow subgroup T, and L an F Sylow subgroup
W. Furthermore, $H = N + T$, $N \cap T = 0$; $L = N + W$, $N \cap W = 0$.


¹The author would like to thank the referee for his suggestions regarding the proof of
this theorem.













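Form an ascending φ chain from N to G:
$N = N_0 \subset N_1 \subset \dots \subset N_i \subset N_{i+1} \subset \dots \subset N_{m-1} = H \subset N_m = G$,
where $N_{i+1}/N_i$ is φ simple and is
contained in the φ centre of $G/N_i$; such a chain can always be constructed
since G/N is primary and hence φ nilpotent.

Define $U_i = N_i \cap T$, for $i = 0, \dots, m - 1$. Then
$0 = U_0 \subset \dots \subset U_i \subset U_{i+1} \subset \dots \subset U_{m-1} = T$
and $N_i = N + U_i$ for $i = 0, \dots, m - 1$, since $N \subset N_i \subset H$ and $H = N + T$.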

We show that if $U_i \subset W$, then $U_i$ is normal in W.

For if w and $u_i$ are elements of W and $U_i$ respectively, then since
$N_i/N_{i-1} \subset Z_\varphi(G/N_{i-1})$, $-w + u_i + w \equiv u_i \pmod{N_{i-1}}$. It follows that
$-w + u_i + w = u_i + u_{i-1} + x$, where x and $u_{i-1}$ are elements of N and $U_{i-1}$ respectively,
since $N_{i-1} = U_{i-1} + N$. Hence $x = -u_{i-1} - u_i - w + u_i + w$ is an element
of $N \cap W = 0$. Thus $-w + u_i + w = u_i + u_{i-1}$ is an element of $U_i$
so that $U_i$ is normal in W.

It follows from the fact that $0 = U_0$ is contained in W but $T = U_{m-1}$ is not
contained in W (since $H = N + T$ is not contained in L), that there exists
an integer s such that $U_{s-1}$ is contained in W but $U_s$ is not contained in W.
We assume that W is a maximal F subgroup of G and show:

(i) $W(u_s) = W$ for $u_s$ in $U_s$ if and only if $u_s$ is in $W \cap U_s$.

(ii) F has finite order, say f. The number of conjugates $W(u_s)$ with $u_s$
in $U_s$ is $[U_s : W \cap U_s] \equiv 0 \pmod{f}$.

(iii) The total number of conjugates of W is $\equiv 0 \pmod{f}$.


Proof of (i). Assume that $W(u_s) = W$ for some element $u_s$ of $U_s$. On the
other hand, $-u_s + w + u_s \equiv w \pmod{N_{s-1}}$ since G and $N_s$ commute, mod
$N_{s-1} = U_{s-1} + N$. Hence $w' = -u_s + w + u_s = w + u_{s-1} + x$, where $u_{s-1}$
and x are elements of $U_{s-1}$ and N respectively. Thus $x = -u_{s-1} - w + w'$ is
an element of $W \cap N = 0$; or $-u_s + w + u_s = w + u_{s-1}$.

Form

$$Q = \bigcap_{w \in W} U_s(w).$$

Then Q(w) = Q for w in W. It follows that {Q, W} = Q + W is an F subgroup
of G. But by hypothesis W is a maximal F subgroup of G; hence Q is contained
in W. But $u_s$ is an element of Q; for $u_s = -w + u_s + w + u_{s-1}$ and hence
is an element of $U_s(w)$ since $-w + u_s + w$ is in $U_s(w)$ and $u_{s-1}$ is in
$U_{s-1} = U_{s-1}(w) \subset U_s(w)$. Thus $u_s$ is in $W \cap U_s$.


Proof of (ii). By hypothesis, W has a finite number of conjugates and by
(i), $W(u_s') = W(u_s)$ if, and only if, $u_s' - u_s$ is in $W \cap U_s$. Therefore, the
number of conjugates $W(u_s)$ is $[U_s : W \cap U_s]$. Now since $U_s$ is φ nilpotent,
$W \cap U_s$ is a φ link for $U_s$, and hence there exists a φ composition chain joining
$W \cap U_s$ to $U_s$ and all the φ composition factors are M-φ isomorphic to
F. Hence F has finite order, say f, and $[U_s : W \cap U_s]$ is divisible by f.


Proof of (iii). Let W′ be any conjugate of W; then there is an integer s′
such that $U_{s'-1}$ is contained in W′, but $U_{s'}$ is not contained in W′. We call s′
the integer associated with the group W′. Since W is by hypothesis a maximal
F subgroup of G, W′ is also a maximal F subgroup and hence applying (ii)
to the group W′, we see that the number of conjugates $W'(u_{s'})$ with $u_{s'}$ in
$U_{s'}$ is divisible by f.

Choose a conjugate $W_1$ of W so that the integer s(1) associated with $W_1$
is as large as possible. Assume that the conjugates $W_j$ have been defined for
$j < i$ and that s(j) is the integer associated with $W_j$. If the groups $W_j(u_{s(j)})$
for $u_{s(j)}$ in $U_{s(j)}$ do not exhaust the conjugates of W, choose $W_i$ so that:

(a) $W_i$ is different from $W_j(u_{s(j)})$ for $u_{s(j)}$ in $U_{s(j)}$ and $j < i$.

(b) The integer s(i) associated with $W_i$ is as large as possible.

Since W has a finite number of conjugates, there are a finite number of
groups $W_i$, say n. We show that $W_i(u_{s(i)}) \neq W_j(u_{s(j)})$ for $j < i$. For if
$W_i(u_{s(i)}) = W_j(u_{s(j)})$, then $W_i = W_j(u_{s(j)} - u_{s(i)})$, and since $s(i) \leq s(j)$ implies
that $U_{s(i)} \subset U_{s(j)}$, $u_{s(j)} - u_{s(i)}$ is in $U_{s(j)}$. But it follows from the definition
of $W_i$ that this is impossible. Hence the groups $W_i(u_{s(i)})$ with $u_{s(i)}$ in $U_{s(i)}$
($1 \leq i \leq n$) are all distinct. But for fixed i, the number of conjugates
$W_i(u_{s(i)})$ is divisible by f. Thus the total number of conjugates of W is divisible
by f.

Hence we have shown that under the assumption that W is a maximal F
subgroup of G, the number of conjugates of W is divisible by f. But any conjugate
of W is an F Sylow subgroup of L, and by the induction assumption
all the F Sylow subgroups of L are conjugates of W (in L and hence in G),
and their number is congruent to 1 modulo f. Thus if W is a maximal F subgroup
of G we have a contradiction. Hence G contains an F subgroup P which
properly contains W. It is clear that P is an F Sylow subgroup for G. This
proves the existence part of the theorem in the particular case we were treating,
that is, where G contains a normal primary φ subgroup N of characteristic
F′ not M-φ isomorphic to F, and G/N is primary of characteristic F. We
prove next the existence part of the theorem in the general case.

Let G be an M-φ group, one of whose φ composition factors is F, and
assume that G is not primary. Let N be a minimal normal φ subgroup of G;
by (B) N is primary. Either N is itself an F Sylow subgroup or else we have
one of the following:

(a) N is primary of characteristic F′ not M-φ isomorphic to F, and
G/N is primary of characteristic F.

(b) G/N is not primary.

If (a) holds, we have the special case treated above. If on the other hand,
G/N is not primary, by the induction assumption it contains an F Sylow subgroup
K/N. Furthermore, since K ≠ G we can use induction again to obtain
an F Sylow subgroup S of K. Clearly S is also an F Sylow subgroup of G. This
completes the proof of the existence of an F Sylow subgroup in the general
case.

We now turn to the second part of the theorem, still using induction on
j(G), the length of a φ composition series for G. Let H be a maximal normal φ
subgroup of G. If G/H is not M-φ isomorphic to F, and T is an F Sylow
subgroup of G, then T is contained in H. For otherwise G = H + T and

$$T/(H \cap T) \underset{(M-\varphi)}{\cong} G/H,$$

where G/H is not M-φ isomorphic to F, which is impossible. Hence in this case
all the F Sylow subgroups of G are F Sylow subgroups of H and conversely.
Since j(H) < j(G), the F Sylow subgroups
are all conjugate in H and hence in G. If there is more than one, F is finite of
order f and the number is congruent to 1 modulo f.
Suppose next

$$G/H \underset{(M-\varphi)}{\cong} F,$$

and H does not have F as a φ composition factor. Let S be an F Sylow subgroup
of G. Then

$$S \underset{(M-\varphi)}{\cong} F$$

and $G = H + S$, $H \cap S = 0$. If S is the unique F Sylow subgroup of G, then
there is nothing to prove. Otherwise let T be an F Sylow subgroup of G distinct
from S. We prove that if T(s) = T for s in S, then s = 0.

Let t be an element of T. Then $-s + t + s = t'$ is an element of T. Now
$-s + t + s - t$ is in H since G/H is abelian; and also $(-s + t + s) - t = t' - t$
is in T. Thus $-s + t + s - t$ is in $H \cap T = 0$. Hence for any t in T,
$-s + t + s = t$ or $s = -t + s + t$ so that if T(s) = T, s is in S(t) for every t in
T. Let

$$Q = \bigcap_{t \in T} S(t);$$

then Q(t) = Q for t in T. Hence {Q, T} = Q + T and therefore {Q, T} = T or
$Q \subset T$. Thus s is in T since s is in Q. But $S \cap T = 0$; therefore s = 0.

Thus if $s_1$ and $s_2$ are distinct elements of S, $T(s_1) \neq T(s_2)$ so that the
number of conjugates T(s) with s in S is equal to the order of F (since $S \cong F$).
By hypothesis, T has a finite number of conjugates. Hence F has finite order f.

If the subgroups T(s) with s in S do not exhaust the conjugates of T distinct
from S, let $T_1$ be such a conjugate of T. Then there are f conjugates $T_1(s)$ with
s in S. Continuing in this way, we find that the number of conjugates of T
distinct from S is congruent to 0 modulo f. However, if S is not a conjugate of
T, we may replace S by T′, a conjugate of T, in the argument above and obtain
the result that the number of conjugates of T is congruent to 1 modulo f.
Thus if S is not a conjugate of T we have a contradiction. Therefore, S is a
conjugate of T and the number of conjugates of T is congruent to 1 modulo f.
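
The counting step above can be traced in a small classical case. The following is an illustrative sketch only, assuming M is empty and φ is the lattice of all subgroups, with G the dihedral group of order 10 written multiplicatively; the labels $r_k$ for the reflections are hypothetical notation introduced here.

```latex
% Assumption: G = D_5 acting on Z_5, H = rotations (order 5), F = Z_2, f = 2.
% Reflections r_k : x \mapsto k - x \ (k \in \mathbb{Z}_5);
% the F Sylow subgroups are the groups \langle r_k \rangle.
\begin{align*}
S &= \langle r_0 \rangle, \qquad r_0\, r_k\, r_0^{-1} = r_{-k},\\
\text{orbits of } S \text{ on } \{\langle r_k\rangle : k \neq 0\} &:\quad
   \{\langle r_1\rangle, \langle r_4\rangle\},\
   \{\langle r_2\rangle, \langle r_3\rangle\}
   \quad (\text{each of size } f = 2),\\
\#\{F \text{ Sylow subgroups}\} &= 1 + 2 + 2 = 5 \equiv 1 \pmod 2 .
\end{align*}
```

No nonzero element of S fixes a Sylow subgroup other than S, so every orbit has exactly f members; this is the mechanism behind the congruence.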

Finally, suppose

$$G/H \underset{(M-\varphi)}{\cong} F$$

and H contains F as a φ composition factor. Let T be an F Sylow subgroup of
G; then $S_1 = H \cap T$ is an F Sylow subgroup of H contained in T, and $S_1$ is
normal in T. Furthermore, if $S_2$ is any F Sylow subgroup of H contained in T,
then $\{S_1, S_2\} = S_1 + S_2$ is an F subgroup of H, but $S_1$ and $S_2$ are both maximal
F subgroups of H; therefore, $S_1 + S_2 = S_1 = S_2$. On the other hand, if S
is any F Sylow subgroup of H, by the induction hypothesis S is a conjugate
of $S_1$, say $S = S_1(g) = (H \cap T)(g)$, and hence $S \subset T(g)$, an F Sylow subgroup
of G. Thus every F Sylow subgroup of G contains one and only one F
Sylow subgroup of H, and every F Sylow subgroup of H is contained in at
least one F Sylow subgroup of G.

In particular, if H has an F Sylow subgroup S which is normal in G, it is the
only F Sylow subgroup of H and hence is contained in every F Sylow subgroup
of G. Thus in this case T is an F Sylow subgroup of G if and only if T/S is an
F Sylow subgroup of G/S. Furthermore, $T_1/S$ and $T_2/S$ are conjugate in
G/S if and only if $T_1$ and $T_2$ are conjugate in G. Hence we deduce the validity
of our theorem in G from its validity in G/S, which we know from the induction
assumption.

Assume, on the other hand, that H has an F Sylow subgroup S which is not
normal in G. Let T be an F Sylow subgroup of G such that $S \subset T$; then, as was
shown above, $S = T \cap H$ and hence S is normal in T. Thus $N_\varphi(S)$, the φ
normalizer of S in G, contains any F Sylow subgroup T of G such that $S \subset T$;
and also $N_\varphi(S) \neq G$ since S is not normal in G.

Now let $S_1, \dots, S_k$ be all the F Sylow subgroups of H. Then either k = 1
or F has finite order f and $k \equiv 1$ modulo f. Consider all the F Sylow subgroups
of G which contain $S_i$. Since these are all contained in $N_\varphi(S_i)$ there are a
finite number of these, say

$$T_1^{(i)}, \dots, T_{n_i}^{(i)};$$

furthermore, either $n_i = 1$ or $n_i$ is congruent to 1 modulo f (if $n_i > 1$, F has
finite order f) and the subgroups

$$T_1^{(i)}, \dots, T_{n_i}^{(i)}$$

are all conjugate. The $T_j^{(i)}$ ($i = 1, \dots, k$; $j = 1, \dots, n_i$) include all F
Sylow subgroups of G, since every F Sylow subgroup of G contains some F
Sylow subgroup of H. Furthermore, they are all distinct since an F Sylow
subgroup of G contains only one F Sylow subgroup of H. Thus the number of
F Sylow subgroups of G is either 1 (if k = 1 and $n_1 = 1$) or is equal to
$n_1 + \dots + n_k \equiv k \equiv 1$ modulo f since each $n_i \equiv 1$ and $k \equiv 1$ modulo f. Also for fixed i
the $T_j^{(i)}$ are conjugates since they are F Sylow subgroups of $N_\varphi(S_i)$, and
$T_j^{(i)}$ is conjugate to $T_{j'}^{(i')}$ since $N_\varphi(S_i)$ is conjugate to $N_\varphi(S_{i'})$. Hence the F
Sylow subgroups of G are all conjugate and if there is more than one, F has
finite order f and their number is congruent to 1 modulo f.


COROLLARY 3.1. If the φ composition factor F of G is infinite, G has just one F
Sylow subgroup and it is normal.

























4. Some further theorems on Sylow subgroups. 


THEOREM 4.1. If S is an F subgroup of G, S is contained in some F Sylow subgroup
of G.


Proof. We prove the theorem by induction on j(G). If j(G) = 1, G is M-φ
isomorphic to F and hence there is nothing to prove. Assume that the theorem
is true for all H such that j(H) < m and that j(G) = m. If G is primary the
theorem is obvious so we assume that G is not primary. Let N be a minimal
normal φ subgroup of G; we distinguish two cases:

(a) G/N is not primary. (N + S)/N is an F subgroup of G/N and hence by
the induction hypothesis there exists an F Sylow subgroup K/N of G/N which
contains (N + S)/N; furthermore, K ≠ G, since G/N is not primary. Now S
is an F subgroup of K; using the induction hypothesis once again we conclude
that S is contained in an F Sylow subgroup P of K, and it is easy to see that P
is also an F Sylow subgroup for G.

(b) G/N is primary. If N + S = G, S is already an F Sylow subgroup for
G, since it follows from the fact that G is not primary that char(N) ≠ F.
If N + S ≠ G, N + S is contained in a maximal normal φ subgroup K of G;
for it follows from the φ nilpotency of G/N that N + S is a φ link for G. Let
P be any F Sylow subgroup for G, then by Lemma 4.1, $P \cap K$ is an F Sylow
subgroup of K. By the induction hypothesis, S is contained in some F Sylow
subgroup of K and hence, since the F Sylow subgroups are all conjugates, is
contained in some conjugate $[P \cap K](k)$ of $P \cap K$. Hence S is contained in
P(k).


THEOREM 4.2. If H is a φ subgroup of G and $P_1$ and $P_2$ are two F Sylow subgroups
for H, they are not contained in the same F Sylow subgroup for G.


Proof. If $P_1$ and $P_2$ are both contained in the F Sylow subgroup S of G,
then $\{P_1, P_2\}$ is a φ subgroup of S and hence is primary with characteristic F.
But $\{P_1, P_2\}$ is contained in H and $P_1$ is an F Sylow subgroup for H; hence
$P_1 = P_2$.


THEOREM 4.3. The φ normalizer N of an F Sylow subgroup P does not contain
any conjugate of P distinct from P itself. Furthermore, N is its own φ normalizer.


Proof. Assume that P′ = −g + P + g is contained in N. Then P′ is an
F Sylow subgroup for N and hence is conjugate to P in N so that there exists
an element n in N such that P′ = −n + P + n. But P is normal in N so that
this implies P′ = P. Let K be the φ normalizer of N in G. If k is in K,
$-k + P + k \subset -k + N + k = N$ and hence −k + P + k = P. Thus P is normal
in K so that K = N.
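
A classical instance may help fix ideas. The sketch below assumes M empty and φ the lattice of all subgroups (an assumption made for illustration, not stated in the text), with $G = S_4$ written multiplicatively and $N_G$ denoting the ordinary normalizer.

```latex
% Assumption: G = S_4, F = Z_3, P = \langle (1\,2\,3) \rangle a Sylow 3-subgroup.
\begin{align*}
N &= N_G(P) = \operatorname{Sym}\{1,2,3\} \cong S_3, \qquad |N| = 6,\quad
     [G : N] = 4 \equiv 1 \pmod 3,\\
&\text{the only subgroup of order 3 in } N \text{ is } P \text{ itself},\\
N_G(N) &= N \quad
   (\text{the four conjugates of } N \text{ are the four point stabilizers in } S_4).
\end{align*}
```

Both assertions of the theorem are visible here: N contains no conjugate of P other than P, and N is self-normalizing.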


More generally we have: 


THEOREM 4.4. If H is a φ subgroup of G such that any φ composition factor
of H has the same multiplicity for H as it does for G, then the φ normalizer of H
is its own φ normalizer.
















Proof. Let N be the φ normalizer of H in G, and K the φ normalizer of N
in G. Then if k is in K and H′ = H(k), −n + H′ + n = H′ for n in N; for

$$-n + H' + n = -n - k + H + k + n = -k + (k - n - k) + H + (k + n - k) + k = -k + H + k$$

since k + n − k is in N. Thus H and H′ are both normal in N and {H, H′} =
H + H′ has the same φ composition factors as H, since H′ has the same φ
composition factors as H. Hence H = H′ and H is normal in K. Therefore,
N = K and the theorem is proved.


THEOREM 4.5. Let $P_1$ be the intersection of the F Sylow subgroups for G, P′
and P″, and assume that if the φ subgroup S contains $P_1$, S is contained in no
more than one F Sylow subgroup for G. Then

(a) The φ normalizers of $P_1$ in the F Sylow subgroups which contain it are all
M-φ isomorphic.

(b) The φ normalizer of $P_1$ in G is not equal to the φ normalizer of $P_1$ in any F
Sylow subgroup containing $P_1$.


Proof. Let $P', P'', \dots, P^{(s)}$ be the F Sylow subgroups (of G) which contain
$P_1$; and let $N', N'', \dots, N^{(s)}$ be the φ normalizers of $P_1$ in these groups.
Since $P_1 \neq P^{(i)}$ and $P_1$ is a φ link for $P^{(i)}$ ($i = 1, \dots, s$), $P_1 \neq N^{(i)}$. Let N
be the φ normalizer of $P_1$ in G. Then $N' = P' \cap N, \dots, N^{(s)} = P^{(s)} \cap N$.

Furthermore, $N^{(i)}$ is an F Sylow subgroup for N; for otherwise $N^{(i)}$ is properly
contained in an F Sylow subgroup Q of N, which in turn is contained in
$P^{(j)}$ for some $j \neq i$. Thus we would have $N^{(i)} \subset Q$, and $Q \subset P^{(j)}$ so that $N^{(i)}$
is contained in both $P^{(i)}$ and $P^{(j)}$, which contradicts the hypothesis since
$P^{(i)}$ and $P^{(j)}$ are different. Hence the subgroups $N^{(i)}$ are F Sylow subgroups
for N and so are M-φ isomorphic, which proves (a).

Now assume that $N = N^{(i)}$ for some i. Then N is an F group and
$N' = N'' = \dots = N^{(s)}$ so that $N' = N'' \subset P' \cap P'' = P_1$, which is impossible.
Hence (b).


REFERENCES 


1. C. W. Ayoub, A theory of normal chains, Can. J. Math., 4 (1952), 162-188. 

2. C. W. Ayoub, On the primary subgroups of a group, Trans. Amer. Math. Soc., 72 (1952), 450-466.

3. R. Remak, Ueber minimale invariante Untergruppen in der Theorie der endlichen Gruppen,
J. reine angew. Math., 162 (1930), 1-16.


Pennsylvania State University 




















ON THE DERIVATION ALGEBRAS OF LIE ALGEBRAS 
SHIGEAKI TOGO 


Let L be a Lie algebra over a field of characteristic 0 and let D(L) be the 
derivation algebra of L, that is, the Lie algebra of all derivations of L. Then 
it is natural to ask the following questions: What is the structure of D(L)? 
What are the relations of the structures of D(L) and L? It is the main purpose 
of this paper to present some results on D(L) as the answers to these questions 
in simple cases. 

Concerning the questions above, we give an example showing that there 
exist non-isomorphic Lie algebras whose derivation algebras are isomorphic 
(Example 3 in § 5). Therefore the structure of a Lie algebra L is not com- 
pletely determined by the structure of D(L). However, there is still some 
intimate connection between the structure of D(L) and that of L. 

Let $L^{[1]} = L\,D(L) = \{\sum x_i D_i : x_i \in L,\ D_i \in D(L)\}$ and define
$L^{[n+1]} = L^{[n]} D(L)$ inductively. L is called characteristically nilpotent provided there
exists an integer k such that $L^{[k]} = (0)$ (4, p. 157). Then L is characteristically
nilpotent if and only if D(L) is nilpotent and L is not one-dimensional (6,
Theorem 1). As an analogue, we call L characteristically solvable provided D(L)
is solvable and the centre of L is contained in [L, L]. Then characteristically
nilpotent Lie algebras are characteristically solvable. It is known that D(L)
is semi-simple if and only if L is semi-simple (5, Theorem 4.4) and that D(L)
is nilpotent if and only if L is characteristically nilpotent or one-dimensional.
In § 2, we shall show that D(L) is the direct sum of a semi-simple ideal and
the radical if and only if either L is reductive or L is the direct sum of a semi-simple
ideal, a characteristically solvable ideal and a central ideal whose
dimension is at most one (Theorem 1). We also prove that D(L) is the direct
sum of a semi-simple ideal and the nilpotent radical if and only if either L
is reductive or L is the direct sum of a semi-simple ideal and a characteristically
nilpotent ideal (Theorem 2). It is known that, as an algebraic Lie algebra,
D(L) has the following structure: $D(L) = \mathfrak{S} + \mathfrak{A} + \mathfrak{N}$ with $[\mathfrak{S}, \mathfrak{A}] = (0)$,
where $\mathfrak{S}$ is a maximal semi-simple subalgebra, $\mathfrak{A}$ is a maximal abelian subalgebra
of the radical consisting of semi-simple elements, and $\mathfrak{N}$ is the ideal
of all nilpotent elements in the radical (1, p. 144). If D(L) is especially the
direct sum of ideals $\mathfrak{S}$, $\mathfrak{A}$, and $\mathfrak{N}$, then either $D(L) = \mathfrak{S} + \mathfrak{N}$ or
$D(L) = \mathfrak{S} + \mathfrak{A}$ where $\mathfrak{A}$ is one-dimensional (Corollary 1 of Theorem 2).
In §§ 1 and 3, we study the derivation algebra of L when L is the direct 


Received December 27, 1959. Research supported by the National Science Foundation, 
U.S.A, 




sum of the ideals $L_i$ (i = 1, 2, ..., m). D(L) is the direct sum of a semi-simple
ideal and the non-abelian nilpotent radical (resp. non-abelian nilpotent,
reductive) if and only if $D(L_i)$, for each i, is also; in the case that the
dimension of the image of the centre of L in L/[L, L] is at most one, D(L) is
the direct sum of a semi-simple ideal and the radical (resp. solvable) if and
only if $D(L_i)$, for each i, is also (Theorem 4). In § 4 we show that if the nilradical
of L is characteristically solvable, then the radical of L is characteristically
solvable and is a direct summand of L (Proposition 2). We also show
some other properties of characteristically solvable Lie algebras (Propositions
3, 4, and 5) and give some examples of such Lie algebras.

Section 5 contains some remarks and the partial answers to the questions 
asked in the first paragraph (Theorems 5 and 6). 


1. Throughout this paper we denote by L a Lie algebra over a field K of
characteristic 0, by D(L) the derivation algebra of L and by Z(L) the centre
of L. For any element x of L, the adjoint mapping ad x: y → [y, x] is a derivation
of L which is called inner. Given a subset M of L, we denote by ad M the
set of all ad x with x in M. L is called reductive provided L is the direct sum
of a semi-simple ideal and the centre Z(L).

Let L be the direct sum of the ideals $L_i$ (i = 1, 2, ..., m). Let $p_i$ denote
the projection of L onto $L_i$. Let E(L) be the set of all linear transformations of
L into L and let $E(L_i, L_j)$ be the set of those of $L_i$ into $L_j$. We shall identify
an element $T_{ij}$ of $E(L_i, L_j)$ with an element $p_i T_{ij}$ of E(L). Thus we have
$E(L_i, L_j) \subset E(L)$. Put $D(L_i, L_j) = D(L) \cap E(L_i, L_j)$. Then it is obvious that
$D(L_i, L_i) = D(L_i)$. We prove the following


LEMMA 1. Let L be the direct sum of the ideals $L_i$ (i = 1, 2, ..., m). Then

(1) $D(L) = \sum_{i,j=1}^{m} D(L_i, L_j)$;

(2) For $i \neq j$, $D(L_i, L_j)$ consists of all elements $T_{ij}$ of $E(L_i, L_j)$ such that
$L_i T_{ij} \subset Z(L_j)$ and $[L_i, L_i] T_{ij} = (0)$;

(3) For $i \neq j$, $D(L_i, L_j)$ is abelian and

$$\Big[ D(L_i, L_j),\ \sum_{k=1}^{m} D(L_k, L_k) \Big] \subset D(L_i, L_j).$$
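
To see the decomposition concretely, here is a minimal worked case. It assumes (for illustration only) that m = 2 and that $L_1 = Ke_1$ and $L_2 = Ke_2$ are one-dimensional abelian ideals, with linear maps written as matrices acting on row vectors, in keeping with the right-action convention of the text; the matrix units $E_{ij}$ (with $e_i E_{ij} = e_j$) are notation introduced here.

```latex
% Assumption: L = K e_1 \oplus K e_2 is abelian, so every linear map is a derivation.
\begin{align*}
D(L) &= \mathfrak{gl}_2(K) = K E_{11} + K E_{12} + K E_{21} + K E_{22},
   \qquad D(L_i, L_j) = K E_{ij},\\
[E_{12},\, aE_{11} + bE_{22}] &= (b - a)\,E_{12} \in D(L_1, L_2)
   \qquad (a, b \in K).
\end{align*}
```

Here $Z(L_j) = L_j$ and $[L_i, L_i] = (0)$, so condition (2) imposes no restriction, and (3) can be read off from the commutator.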


Proof. We shall first prove (2). Let $D_{ij}$ be an element of $D(L_i, L_j)$ with
$i \neq j$. Then, for $x_i$ in $L_i$ and $x_j$ in $L_j$, we have

$$0 = [x_i, x_j] D_{ij} = [x_i D_{ij}, x_j].$$

Therefore $L_i D_{ij} \subset Z(L_j)$. Furthermore, for elements $x_i$ and $y_i$ of $L_i$, we
have

$$[x_i, y_i] D_{ij} = [x_i D_{ij}, y_i] + [x_i, y_i D_{ij}] = 0,$$

which shows that $[L_i, L_i] D_{ij} = (0)$. Conversely, suppose that $T_{ij}$ is an
element of $E(L_i, L_j)$ satisfying the conditions in (2). Then it is immediate
that

$$[x_k, x_l] T_{ij} = [x_k T_{ij}, x_l] = [x_k, x_l T_{ij}] = 0$$

for all $x_k$ in $L_k$ and all $x_l$ in $L_l$. It follows that $T_{ij}$ is a derivation of L, that is,
that $T_{ij}$ belongs to $D(L_i, L_j)$. Thus (2) is proved.


Let D be any element of D(L). Put $T_{ij} = p_i D p_j$. Then

$$D = \sum_{i,j=1}^{m} T_{ij}$$

where $T_{ij}$ belongs to $E(L_i, L_j)$. It is easy to see that $T_{ii}$ is a derivation of
$L_i$ and that, for $i \neq j$, $T_{ij}$ satisfies the conditions in (2). Therefore it follows
from (2) proved above that $T_{ij}$ belongs to $D(L_i, L_j)$. Thus we have

$$D(L) \subset \sum_{i,j=1}^{m} D(L_i, L_j).$$

Since the converse inclusion is evident, we have

$$D(L) = \sum_{i,j=1}^{m} D(L_i, L_j)$$

and (1) is proved.
(3) is evident. Thus the lemma is proved.


Let $\bar{D}(L)$ denote the subalgebra of D(L) consisting of all elements D of
D(L) such that $L D \subset Z(L)$. Then we have


LEMMA 2. Let L be the direct sum of the ideals $L_i$ (i = 1, 2, ..., m). Suppose
that $Z(L_j) \subset [L_j, L_j]$ for some j. Then

(1) $\bar{D}(L_j)$ is an abelian ideal of D(L);

(2) $[D(L_i, L_j), D(L_j, L_i)] \subset \bar{D}(L_j)$ for all $i \neq j$.


Proof. Let $D_{jj}$ be any element of $\bar{D}(L_j)$. Then it is immediate that
$[L_j, L_j] D_{jj} = (0)$ and therefore that $Z(L_j) D_{jj} = (0)$. By using the fact that the centre
of $L_j$ is stable under all derivations of $L_j$, it is easy to see that $\bar{D}(L_j)$ is an
abelian ideal of $D(L_j)$. By Lemma 1 (2) it is clear that any element $D_{ji}$ of
$D(L_j, L_i)$ with $i \neq j$ satisfies $Z(L_j) D_{ji} = (0)$. Therefore it is immediate that

$$\Big[ \bar{D}(L_j),\ \sum_{i \neq j} D(L_i) + \sum_{i \neq k} D(L_i, L_k) \Big] = (0).$$

We can now use Lemma 1 (1) to conclude that $\bar{D}(L_j)$ is an abelian ideal of D(L),
and (1) is proved.


For $i \neq j$, let $D_{ij}$ and $D_{ji}$ be any elements of $D(L_i, L_j)$ and $D(L_j, L_i)$
respectively. Then, by Lemma 1 (2), we have

$$L_i [D_{ij}, D_{ji}] = L_i D_{ij} D_{ji} \subset Z(L_j) D_{ji} = (0),$$
$$L_j [D_{ij}, D_{ji}] = L_j D_{ji} D_{ij} \subset Z(L_i) D_{ij} \subset Z(L_j),$$

which shows that $[D_{ij}, D_{ji}]$ belongs to $\bar{D}(L_j)$. Thus we have (2), completing
the proof.


2. In this section we determine the structure of the Lie algebra L such that
D(L) is the direct sum of a semi-simple ideal and the radical. We begin with


LEMMA 3. Let L be a solvable Lie algebra such that $Z(L) \subset [L, L]$. If D(L)
is the direct sum of a semi-simple ideal and the radical, then L is characteristically
solvable.


Proof. It is clear that L is not abelian. Write $D(L) = \mathfrak{S} + \mathfrak{R}$ where $\mathfrak{S}$
is a semi-simple ideal and $\mathfrak{R}$ is the radical of D(L). Since ad L is a solvable
ideal of D(L), it follows that ad $L \subset \mathfrak{R}$. Let D be any element of $\mathfrak{S}$. Then
ad (L D) = [ad L, D] = (0) by hypothesis. Therefore $L D \subset Z(L)$. Since
$Z(L) \subset [L, L]$ by hypothesis, it follows that $D^2 = 0$, which shows that all
elements of $\mathfrak{S}$ are nilpotent. By Engel's theorem, $\mathfrak{S}$ is nilpotent and therefore
$\mathfrak{S} = (0)$. Thus D(L) is solvable and L is characteristically solvable, completing
the proof.
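
The step from $LD \subset Z(L)$ to $D^2 = 0$ is left implicit in the proof. A short derivation, using only the derivation identity and the right-action convention $x \mapsto xD$ of the text, might run as follows.

```latex
% For x, y in L and a derivation D with LD \subset Z(L):
\begin{align*}
[x, y]D &= [xD, y] + [x, yD] = 0 \quad (\text{since } xD,\ yD \in Z(L)),
   &&\text{so } [L, L]D = (0),\\
Z(L)D &\subset [L, L]D = (0) \quad (\text{since } Z(L) \subset [L, L]),\\
L D^2 &= (LD)D \subset Z(L)D = (0),
   &&\text{so } D^2 = 0 .
\end{align*}
```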


LEMMA 4. Let L be a non-abelian solvable Lie algebra. If D(L) is the direct
sum of a semi-simple ideal and the radical, then D(L) is solvable, and L is either 
characteristically solvable or the direct sum of a characteristically solvable ideal 
and a one-dimensional central ideal. 


Proof. By virtue of Lemma 3, it suffices to prove the lemma when
$Z(L) \not\subset [L, L]$. Let $L_1$ and Z be subspaces of Z(L) such that

$$Z(L) = L_1 + Z, \quad L_1 \cap [L, L] = (0), \quad \text{and} \quad Z \subset [L, L].$$

Let $L_2$ be a subspace of L containing [L, L] such that

$$L = L_1 + L_2, \quad L_1 \cap L_2 = (0).$$

Then it is clear that $L_1$ is a non-zero central ideal of L and that $L_2$ is a non-zero
ideal of L such that $Z(L_2) \subset [L_2, L_2]$.

By hypothesis, $D(L) = \mathfrak{S} + \mathfrak{R}$ where $\mathfrak{S}$ is a semi-simple ideal and $\mathfrak{R}$ is
the radical of D(L). Write $D(L_2) = \mathfrak{S}_2 + \mathfrak{R}_2$ with $\mathfrak{S}_2$ a semi-simple subalgebra
and $\mathfrak{R}_2$ the radical of $D(L_2)$. Then, since $Z(L_2) \subset [L_2, L_2]$, it follows
from Lemma 2 (1) that $\mathfrak{R}_2$ contains $\bar{D}(L_2)$. Let $D_1$ be the identity derivation
of $L_1$ and let $\mathfrak{M}$ be the space spanned by $D_1$, $D(L_1, L_2)$, $D(L_2, L_1)$ and $\mathfrak{R}_2$.
Then, by Lemma 1 (1), (3) and Lemma 2 (2), it is immediate that $\mathfrak{M}$ is an
ideal of D(L). We assert that $\mathfrak{M}$ is solvable. In fact, by Lemma 1 (3) and
Lemma 2 (2), we have

$$\mathfrak{M}^{(1)} \subset \mathfrak{R}_2^{(1)} + (\bar{D}(L_2) + D(L_1, L_2) + D(L_2, L_1)).$$
















Since $\mathfrak{R}_2^{(k)} = (0)$ for some integer k, it follows that

$$\mathfrak{M}^{(k)} \subset \bar{D}(L_2) + D(L_1, L_2) + D(L_2, L_1).$$

By Lemma 2 we have $\mathfrak{M}^{(k+1)} \subset \bar{D}(L_2)$. It now follows from Lemma 2 (1)
that $\mathfrak{M}^{(k+2)} = (0)$, that is, that $\mathfrak{M}$ is solvable, as was asserted. Thus $\mathfrak{M}$ is a
solvable ideal of D(L) and therefore it is contained in $\mathfrak{R}$. Since $\mathfrak{S}$ is the unique
maximal semi-simple subalgebra of D(L), it contains $\mathfrak{S}_2$. Therefore $[\mathfrak{S}_2, \mathfrak{R}_2] = (0)$,
which shows that $D(L_2)$ is the direct sum of a semi-simple ideal and the
radical. Therefore we can use Lemma 3 to see that $L_2$ is characteristically
solvable and we see that

$$\mathfrak{M} = (D_1) + D(L_1, L_2) + D(L_2, L_1) + D(L_2).$$


Furthermore, we assert that dim $L_1$ = 1. In fact, if dim $L_1$ > 1, then
$D(L_1) = \mathfrak{S}_1 + (D_1)$ where $\mathfrak{S}_1$ is a non-zero semi-simple ideal of $D(L_1)$.
Therefore $D(L) = \mathfrak{S}_1 + \mathfrak{M}$ and $[\mathfrak{S}_1, \mathfrak{M}] = (0)$ by hypothesis. Let $D_{11}$ be any
element of $\mathfrak{S}_1$. Then

$$D_{21} D_{11} = [D_{21}, D_{11}] = 0$$

for any element $D_{21}$ of $D(L_2, L_1)$. But, since $L_1$ is abelian and $L_2 \neq (0)$, it
follows from Lemma 1 (2) that $L_2 D(L_2, L_1) = L_1$. Therefore we have $D_{11} = 0$,
whence $\mathfrak{S}_1 = (0)$, which is a contradiction. Thus $L_1$ must be one-dimensional,
as was asserted. We now see that $D(L_1) = (D_1)$ and therefore that $D(L) = \mathfrak{M}$.
Thus D(L) is solvable and the lemma is proved.


We can now prove the following 


THEOREM 1. D(L) is the direct sum of a semi-simple ideal and the radical if 
and only if L is one of the following Lie algebras: 

(1) L is reductive; 

(2) L is the direct sum of a semi-simple ideal and a characteristically solvable 
ideal; 

(3) L is the direct sum of a semi-simple ideal, a characteristically solvable 
ideal, and a one-dimensional central ideal. 


Proof. Suppose that D(L) is the direct sum of a semi-simple ideal $\mathfrak{S}$ and
the radical $\mathfrak{R}$. Write L = S + R where S is a semi-simple subalgebra and R
is the radical of L. Then it is clear that ad S and ad R are contained in $\mathfrak{S}$ and
$\mathfrak{R}$ respectively. Therefore

$$\operatorname{ad}\,[S, R] = [\operatorname{ad} S, \operatorname{ad} R] = (0),$$

from which it follows that [S, [S, R]] = (0). Since ad S is completely reducible,
it follows that [S, R] = (0). Thus L is the direct sum of the ideals S and R.
Since Z(S) = (0) and S = [S, S], by Lemma 1 (2) it is clear that D(S, R) =
D(R, S) = (0). Therefore by Lemma 1 (1) we have D(L) = D(S) + D(R).
It now follows that $\mathfrak{R}$ is the radical of D(R) and therefore that
$D(R) = \mathfrak{S} \cap D(R) + \mathfrak{R}$. Since $\mathfrak{S} \cap D(R)$ is semi-simple as an ideal of $\mathfrak{S}$, D(R) is the
direct sum of a semi-simple ideal and the radical. We can use Lemma 4 to see
that R is abelian or characteristically solvable or the direct sum of a characteristically
solvable ideal and a one-dimensional central ideal. Thus the necessity
of the condition is proved.


To prove the sufficiency of the condition, if $L$ is reductive, write $L = S + A$
with $S$ a semi-simple ideal and $A$ an abelian ideal. Then by Lemma 1 we have
$D(L) = D(S) + D(A)$. Since $D(A)$ is the direct sum of a semi-simple ideal
and the one-dimensional central ideal, so is $D(L)$. If $L$ is the Lie algebra as in
(2), then $D(L)$ is clearly the direct sum of a semi-simple ideal and the radical.
If $L$ is the Lie algebra as in (3), write $L = S + R + Z$ where $S$ is a semi-simple
ideal, $R$ is a characteristically solvable ideal, and $Z$ is a one-dimensional central
ideal. Then $D(L)$ is the direct sum of the ideals $D(S)$ and $D(R + Z)$, the latter
being the radical of $D(L)$ (cf. the fact that $\mathfrak{M}$ is solvable in the proof of Lemma
4). Thus the theorem is proved.


As an immediate consequence of Theorem 1, we have 


COROLLARY 1. D(L) is solvable if and only if L is characteristically solvable
or one-dimensional or the direct sum of a characteristically solvable ideal and a 
one-dimensional central ideal. 


The following corollary is remarked in (6, § 3). 


COROLLARY 2. If D(L) consists of semi-simple elements, then L is a reductive
Lie algebra whose centre is at most one-dimensional.


Proof. Since the radical of $D(L)$ consists of semi-simple elements, it follows
from the structure theorem on algebraic Lie algebras (1, p. 144) that $D(L)$
is reductive. By Theorem 1 we see that $L$ is reductive. If $\dim Z(L) > 1$, then
it is evident that $L$ has a non-zero nilpotent derivation. Therefore
$\dim Z(L) \leq 1$, completing the proof.


Let $D_0(L) = L$, $D_1(L) = D(L)$ and let $D_n(L)$ be the derivation algebra
of $D_{n-1}(L)$. Then we have the following corollary correcting (7, Theorem 4).


COROLLARY 3. For any integers $m, n \geq 0$, $D_m(L)$ is reductive if and only
if $D_n(L)$ is reductive. Then all the $D_n(L)$'s with $n \geq 1$ are completely reducible
and isomorphic to each other.


Proof. It follows from Theorem 1 that $D_n(L)$ is reductive if and only if
$D_{n-1}(L)$ is reductive. Therefore the first part of the corollary is evident. If
some $D_m(L)$ is reductive, then all the $D_n(L)$ with $n \geq 1$ are completely
reducible. Since the centre of $D(L)$ is at most one-dimensional, it is
immediate that all the $D_n(L)$'s with $n \geq 1$ are isomorphic to each other,
completing the proof.


In Theorem 1, if $L$ is not reductive, then the maximal semi-simple
subalgebra of $D(L)$ is ad $S$ with $S$ the maximal semi-simple ideal of $L$. We here
note the following proposition which is an easy consequence of (5, Theorem
4.3).


PROPOSITION 1. Let $S$ be a maximal semi-simple subalgebra of $L$. Let $R$ be
the radical of $L$ and let $\mathfrak{M}$ be the subalgebra of $D(R)$ consisting of all derivations
$D$ of $R$ which can be trivially extended to a derivation of $L$, that is, such that,
by putting $SD = (0)$, $D$ is a derivation of $L$. Then ad $S$ is a maximal
semi-simple subalgebra of $D(L)$ if and only if $\mathfrak{M}$ is solvable.


Proof. We identify an element of $\mathfrak{M}$ with the trivially extended derivation
of $L$. Therefore $\mathfrak{M} \subset D(L)$. Let $D$ be any element of $D(L)$. Then, as is well
known, there exists an element $x$ of $L$ such that the restriction of $D$ to $S$ is
equal to the restriction of ad $x$ to $S$ as derivations of $S$ into $L$. Put
$D' = D - \mathrm{ad}\, x$. Then it is clear that $D'$ belongs to $\mathfrak{M}$, which shows that
$D(L) = \mathrm{ad}\, L + \mathfrak{M}$. If we write $\mathfrak{M}_1 = \mathrm{ad}\, R + \mathfrak{M}$, then it is immediate that $\mathfrak{M}_1$ is an ideal
of $D(L)$ and $\mathrm{ad}\, S \cap \mathfrak{M}_1 = (0)$. Let $\mathfrak{R}$ be the radical of $D(L)$. Then, since
$D(L)/\mathfrak{M}_1$ is semi-simple, it follows that $\mathfrak{R}$ is contained in $\mathfrak{M}_1$.

If $\mathfrak{M}$ is solvable, then $\mathfrak{M}_1$ is solvable and therefore ad $S$ is a maximal
semi-simple subalgebra of $D(L)$. Conversely, if ad $S$ is such a subalgebra of $D(L)$,
then it is clear that $\dim \mathfrak{R} = \dim \mathfrak{M}_1$. Since $\mathfrak{R} \subset \mathfrak{M}_1$, we have $\mathfrak{R} = \mathfrak{M}_1$.
Therefore $\mathfrak{M}$ is solvable, completing the proof.


Before we state the second theorem, we prove 


LEMMA 5. Let $L$ be a non-abelian nilpotent Lie algebra such that $Z(L)$ is not
contained in $[L, L]$. Then $D(L)$ is not nilpotent. $D(L)$ actually contains a solvable
non-nilpotent ideal.


Proof. Let $L_1$ and $Z$ be the subspaces of $Z(L)$ such that

$Z(L) = L_1 + Z$, $L_1 \cap [L, L] = (0)$, and $Z \subset [L, L]$.

Let $L_2$ be a subspace of $L$, complementary to $L_1$ and containing $[L, L]$. Then
it is clear that $Z(L_2) \subset [L_2, L_2]$. Let $D_1$ be the identity derivation of $L_1$ and
let $\mathfrak{M}$ be the space spanned by $D_1$, $D(L_1, L_2)$, $D(L_2, L_1)$, and $D(L_2)$. We assert
that $\mathfrak{M}$ is a solvable non-nilpotent ideal of $D(L)$. In fact, by Lemma 1 (1),
(3) and Lemma 2 (2), we see that $\mathfrak{M}$ is an ideal of $D(L)$. It is obvious that

$\mathfrak{M}^{(1)} \subset D(L_2) + D(L_1, L_2) + D(L_2, L_1).$

Therefore it follows from Lemma 2 that $\mathfrak{M}^{(3)} = (0)$, that is, that $\mathfrak{M}$ is solvable.
By the hypothesis that $L$ is non-abelian and nilpotent, we have
$D(L_1, L_2) \neq (0)$. Since

$[D_1, D(L_1, L_2)] = D(L_1, L_2),$

it follows that $\mathfrak{M}$ is not nilpotent. Thus $\mathfrak{M}$ is a solvable non-nilpotent ideal
of $D(L)$, as was asserted. The proof is complete.


We can now prove the following 
















THEOREM 2. D(L) is the direct sum of a semi-simple ideal and the nilpotent
radical if and only if either L is reductive or L is the direct sum of a semi-simple
ideal and a characteristically nilpotent ideal.


Proof. The sufficiency of the condition is immediate by Lemma 1. To prove
the necessity, suppose that $D(L)$ is the direct sum of a semi-simple ideal
and the nilpotent radical. Then, by Theorem 1, we have that (1) $L$ is reductive
or (2) $L$ is the direct sum of a semi-simple ideal $S$ and a characteristically
solvable ideal $R$, or (3) $L$ is the direct sum of a semi-simple ideal $S$, a
characteristically solvable ideal $R$, and a one-dimensional ideal $Z$. In the case (2),
$D(R)$ must be nilpotent and $R$ is not one-dimensional. Therefore by (6,
Theorem 1) $R$ is characteristically nilpotent. It now suffices to show that the
case (3) does not happen. If $L$ is the Lie algebra in (3), then it follows from
Lemma 1 that $D(L) = D(S) + D(R + Z)$. Since $D(R + Z)$ is a solvable ideal
of $D(L)$ by Theorem 1, it must be nilpotent by our assumption. Therefore
$R + Z$ is a non-abelian nilpotent Lie algebra. Then Lemma 5 tells us that
$D(R + Z)$ is not nilpotent, which is a contradiction. Therefore we cannot
have the case (3). Thus the theorem is proved.


COROLLARY 1. If D(L) is the direct sum of a semi-simple ideal and the
nilpotent radical, then the radical of D(L) is either one-dimensional and consists
of semi-simple elements or consists of nilpotent elements.


Proof. This is immediate from Theorem 2 and the fact that $N$ is a
characteristically nilpotent Lie algebra if and only if all the derivations of $N$ are
nilpotent.


COROLLARY 2. Let $R$ and $N$ be the radical and the nil-radical of $L$ respectively.
Then the following conditions are equivalent:

(1) $D(L)$ is the direct sum of a semi-simple ideal and the radical consisting
of nilpotent elements;

(2) $R$ is characteristically nilpotent;

(3) $N$ is characteristically nilpotent;

(4) $N D(L)^n = (0)$ for some integer $n$.

If $L$ satisfies one of these conditions, then $R = N$.


Proof. (1) → (2) is an immediate consequence of Theorem 2. (2) → (3) is
evident, since (2) implies that $R = N$. (3) → (4) is immediate by the fact
that $N$ is stable under all derivations of $L$. Therefore it suffices to prove that
(4) → (1). Suppose that $L$ satisfies the condition (4). Let $S$ be a maximal
semi-simple subalgebra of $L$. Then $L = S + R$. Since $N(\mathrm{ad}\, S)^n = (0)$ and
$[R, S] \subset N$, it follows that $R(\mathrm{ad}\, S)^{n+1} = (0)$. Since ad $S$ is completely reducible, we
have $R(\mathrm{ad}\, S) = (0)$, that is, $[R, S] = (0)$. Then, by Lemma 1, $D(L)$ is the
direct sum of the ideals $D(S)$ and $D(R)$. It is obvious that $D(S)$ is semi-simple.
From the fact that $RD \subset N$ for any $D$ in $D(R)$, it follows that
$R\, D(R)^{n+1} = (0)$ and therefore that $D(R)$ consists of nilpotent elements. Thus we see that
(1) is satisfied by $L$. The proof is complete.





3. This section is devoted to the study of D(L) in the case that L is the 
direct sum of the ideals. By using Lemma 1, we can first prove 


THEOREM 3. Let $L$ be the direct sum of the ideals $L_i$ $(i = 1, 2, \ldots, n)$. Then
$D(L) = D(L_1) + D(L_2) + \ldots + D(L_n)$ if and only if $L$ satisfies one of the
following conditions:

(1) $Z(L) = (0)$;
(2) $L = [L, L]$;
(3) All the $L_i$'s except one are such that $Z(L_i) = (0)$ and $L_i = [L_i, L_i]$.


Proof. If $Z(L) = (0)$ (resp. $L = [L, L]$), then it is clear that $Z(L_i) = (0)$
(resp. $L_i = [L_i, L_i]$) for all $i$. Therefore, if one of the three conditions in the
statement is satisfied by $L$, it follows from Lemma 1 (2) that $D(L_i, L_j) = (0)$
for all $i \neq j$. By Lemma 1 (1) we have $D(L) = \sum_{i=1}^{n} D(L_i)$. Conversely,
suppose that $D(L) = \sum_{i=1}^{n} D(L_i)$. If $Z(L) \neq (0)$ and $L \neq [L, L]$, let $i$ and
$j$ be respectively any integers such that $Z(L_i) \neq (0)$ and such that
$L_j \neq [L_j, L_j]$. If $i \neq j$, then by Lemma 1 (2) we have $D(L_j, L_i) \neq (0)$, contrary to
our assumption. Therefore $i = j$. This shows that there exists only one $L_i$ such
that $Z(L_i) \neq (0)$ and $L_i \neq [L_i, L_i]$, and that all the other $L_k$'s satisfy the
conditions $Z(L_k) = (0)$ and $L_k = [L_k, L_k]$. The proof is complete.


LEMMA 6. If $D(L)$ is abelian, then $L$ is one-dimensional.

Proof. If $D(L)$ is abelian, then we have

ad $[L, L]$ = [ad $L$, ad $L$] = (0),

from which it follows that $L^3 = (0)$. Then it is easy to construct a non-zero
semi-simple derivation of $L$, whence $L$ is not characteristically nilpotent. By
(6, Theorem 1) we see that $L$ is one-dimensional.
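
The construction of a semi-simple derivation when $L^3 = (0)$ can be checked on a small case. The sketch below is illustrative only and is not part of the original argument: it assumes the three-dimensional Heisenberg algebra (basis $x, y, z$ with $[x, y] = z$) as a sample algebra, the row convention $x_i D = \sum_j d_{ij} x_j$, and a hypothetical helper `is_derivation`. It tests the diagonal map acting as 1 on a complement of $[L, L]$ and as 2 on $[L, L]$.

```python
import numpy as np

def is_derivation(c, d, tol=1e-12):
    """Check [x_i, x_j]D = [x_i D, x_j] + [x_i, x_j D] for all basis pairs.

    c[i, j, k] is the coefficient of x_k in [x_i, x_j];
    d[i, j] is the coefficient of x_j in x_i D (operators act on the right).
    """
    n = c.shape[0]
    for i in range(n):
        for j in range(n):
            lhs = c[i, j] @ d                       # [x_i, x_j] D
            rhs = d[i] @ c[:, j, :] + d[j] @ c[i]   # [x_i D, x_j] + [x_i, x_j D]
            if not np.allclose(lhs, rhs, atol=tol):
                return False
    return True

# Heisenberg algebra (an assumed sample): [x, y] = z, all other brackets zero.
c = np.zeros((3, 3, 3))
c[0, 1, 2], c[1, 0, 2] = 1.0, -1.0

d = np.diag([1.0, 1.0, 2.0])   # x -> x, y -> y on the complement; z -> 2z on [L, L]
identity = np.eye(3)           # for comparison: the identity map

print(is_derivation(c, d), is_derivation(c, identity))
```

The diagonal map passes the check while the identity map does not, since the latter sends $z$ to $z$ rather than to $[x, y] + [x, y] = 2z$.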

We now prove the following 


THEOREM 4. Let $L$ be the direct sum of the ideals $L_i$ $(i = 1, 2, \ldots, n)$. Then

(1) $D(L)$ is reductive (resp. semi-simple) if and only if $D(L_i)$, for each $i$, is
reductive (resp. semi-simple);

(2) $D(L)$ is the direct sum of a semi-simple ideal and the non-abelian
nilpotent radical if and only if $D(L_i)$, for each $i$, is such a direct sum;

(3) $D(L)$ is non-abelian nilpotent if and only if $D(L_i)$, for each $i$, is
non-abelian nilpotent.

If $\dim (Z(L) + [L, L]/[L, L]) \leq 1$, then

(4) $D(L)$ is the direct sum of a semi-simple ideal and the radical if and only if
$D(L_i)$, for each $i$, is such a direct sum;

(5) $D(L)$ is solvable if and only if $D(L_i)$, for each $i$, is solvable.


Proof. (1) is immediate from Corollary 3 of Theorem 1, Lemma 1 (1), (2),
and the fact that $L$ is reductive (resp. semi-simple) if and only if $L_i$, for each
$i$, is reductive (resp. semi-simple).

(3) is a consequence of (6, Theorem 6), but for completeness we write the
proof in a slightly different way. Suppose that all $D(L_i)$'s are non-abelian
nilpotent. Then all $L_i$'s are characteristically nilpotent, whence we have
$Z(L_i) \subset [L_i, L_i]$ for all $i$. Therefore $L\, D(L_i, L_j) D(L_j, L_k) = (0)$ for all $i, j, k$
such that $i \neq j$, $j \neq k$. Let $m_i$ and $l_i$ be the integers such that

$D(L_i)^{m_i} = (0)$ and $L_i^{[l_i]} = (0)$,

and let $m$ be the maximal integer of all $m_i$ and $l_i$. By Lemma 1 (1), we have

$$D(L)^{2m} = \sum_{i} D(L_i)^{2m} + \sum_{p=0}^{2m-1} \sum_{i \neq j} [\ldots[[D(L_i)^{p}, D(L_i, L_j)], D(L_j)], \ldots, D(L_j)] + \ldots + \sum_{i \neq j,\ j \neq k,\ \ldots,\ l \neq q} [\ldots[D(L_i, L_j), D(L_j, L_k)], \ldots, D(L_l, L_q)],$$

where $D(L_i)^0$ means the identity transformation of $L_i$ into $L_i$ for each $i$.
It is clear that all the terms except the ones

$\mathfrak{M} = [\ldots[[D(L_i)^{p}, D(L_i, L_j)], D(L_j)], \ldots, D(L_j)]$ with $i \neq j$ and $p < m$

are equal to (0). But we have

$L\mathfrak{M} \subset L_j D(L_j)^{2m-p-1} \subset L_j^{[l_j]} = (0),$

since $2m - p - 1 \geq m \geq l_j$. Therefore $\mathfrak{M} = (0)$. Thus we see that
$D(L)^{2m} = (0)$. Since $m > 1$, $D(L)$ is non-abelian nilpotent. Conversely, suppose that
$D(L)$ is non-abelian nilpotent. Then it is clear that all $D(L_i)$'s are nilpotent.
If some $D(L_i)$ is abelian, it follows from Lemma 6 that $L_i$ is one-dimensional
and therefore from Lemma 5 that $D(L)$ is not nilpotent, contrary to our
supposition. Therefore all $D(L_i)$'s are not abelian. Thus (3) is proved.


To prove (5), suppose that $\dim (Z(L) + [L, L]/[L, L]) \leq 1$. Then either
there exists only one suffix $i_0$ such that

$Z(L_{i_0}) \not\subset [L_{i_0}, L_{i_0}],$

or $Z(L_i) \subset [L_i, L_i]$ for all $i$. In the first (resp. second) case, let $\mathfrak{M}$ be

$\sum_{i \neq j} D(L_i, L_j) + \sum_{i \neq i_0} D(L_i)$ (resp. $\sum_{i \neq j} D(L_i, L_j)$).

For $i \neq k$ and $j \neq k$, by Lemma 1 (2) and Lemma 2 (2) we have

$$[D(L_i, L_k), D(L_k, L_j)] \begin{cases} = (0) & \text{if } i \neq j \text{ and } k \neq i_0, \\ \subset D(L_i, L_j) & \text{if } i \neq j \text{ and } k = i_0, \\ \subset D(L_i) & \text{if } i = j \text{ and } k = i_0, \\ \subset D(L_k) & \text{if } i = j = i_0, \\ = (0) & \text{if } i = j \neq i_0 \text{ and } k \neq i_0, \end{cases}$$

and

$[D(L_i, L_k), D(L_i) + D(L_k)] = (0)$ if $i \neq i_0$ and $k \neq i_0$.

In the first case, we have

$$\mathfrak{M}^{(1)} \subset \sum_{i, j \neq i_0,\ i \neq j} D(L_i, L_j) + \sum_{i \neq i_0} D(L_i)$$

and therefore $\mathfrak{M}^{(2)} = (0)$. In the second case, we have $\mathfrak{M}^{(1)} = (0)$. Thus $\mathfrak{M}$
is a solvable subalgebra of $D(L)$. Furthermore, by Lemma 1 (3) and Lemma
2 (1), it is immediate that $\mathfrak{M}$ is an ideal of $D(L)$. Therefore, if $D(L_i)$ is solvable
for each $i$, then $\sum_{i=1}^{n} D(L_i)$ is solvable. Since $D(L) = \sum_{i=1}^{n} D(L_i) + \mathfrak{M}$, it
follows that $D(L)$ is solvable. The converse is evident and (5) is proved.


To prove (2) (resp. (4)), let $L$ be any Lie algebra (resp. a Lie algebra such
that $\dim (Z(L) + [L, L]/[L, L]) \leq 1$). If $D(L_i)$, for each $i$, is the direct sum
of a semi-simple ideal and the non-abelian nilpotent radical (resp. the radical),
then it follows from Theorem 2 (resp. Theorem 1) that $L_i$, for each $i$, is the
direct sum of a semi-simple ideal $S_i$ and the radical $R_i$ with $D(R_i)$ non-abelian
nilpotent (resp. solvable). Put $S = \sum_{i=1}^{n} S_i$ and $R = \sum_{i=1}^{n} R_i$. Then $L$ is the
direct sum of a semi-simple ideal $S$ and the radical $R$. Then $D(R)$ is
non-abelian nilpotent by (3) (resp. solvable by (5), since it is clear that
$\dim (Z(R) + [R, R]/[R, R]) \leq 1$). By Lemma 1 we see that $D(L)$ is the direct
sum of a semi-simple ideal $D(S)$ and the non-abelian nilpotent radical (resp.
the radical) $D(R)$.

Conversely, if $D(L)$ is the direct sum of a semi-simple ideal and the
non-abelian nilpotent radical (resp. the radical), then it follows from Theorem 2
(resp. Theorem 1) that $L$ is the direct sum of a semi-simple ideal $S$ and the
radical $R$ with $D(R)$ non-abelian nilpotent (resp. solvable). Then $L_i$, for each
$i$, is the direct sum of a semi-simple ideal $S_i$ and the radical $R_i$, and we have
$S = \sum_{i=1}^{n} S_i$ and $R = \sum_{i=1}^{n} R_i$. Therefore it follows from (3) that $D(R_i)$
is non-abelian nilpotent (resp. solvable) for all $i$. By Lemma 1 $D(L_i)$, for each
$i$, is the direct sum of a semi-simple ideal $D(S_i)$ and the non-abelian nilpotent
radical (resp. the radical) $D(R_i)$. Thus (2) and (4) are proved. The proof is
complete.


We note that, by Lemma 6 and (6, Theorem 1), Theorem 4 (2) is equivalent
to the following statement: $D(L)$ is the direct sum of a semi-simple ideal and
the radical consisting of nilpotent elements if and only if $D(L_i)$, for each $i$, is such
a direct sum.


4. In this section we show some properties and some examples of character- 
istically solvable Lie algebras. We first prove the following 


PROPOSITION 2. (1) If the radical of L is characteristically solvable (resp. the
direct sum of a characteristically solvable ideal and a one-dimensional central
ideal), then it is a direct summand of L.

(2) If the nil-radical of L is characteristically solvable, then the radical of L is
also characteristically solvable.
















Proof. Let $S$ and $R$ be respectively a maximal semi-simple subalgebra and
the radical of $L$. If $R$ satisfies the assumption in (1), then $D(R)$ is solvable,
whence the image of the restriction homomorphism of ad $S$ into $D(R)$ is
semi-simple and solvable. Therefore the image is (0), which shows that
$[R, S] = (0)$, that is, that $R$ is a direct summand of $L$, proving (1).

To prove (2), suppose that the nil-radical $N$ of $L$ is characteristically solvable.
Let $\mathfrak{S}$ be a maximal semi-simple subalgebra of $D(R)$. Then, since $N$ is stable
under all derivations of $L$, it is immediate that the image of the restriction
homomorphism of $\mathfrak{S}$ into $D(N)$ is equal to (0), which shows that $N\mathfrak{S} = (0)$.
Since $RD \subset N$ for any $D$ in $D(R)$, we have $R\mathfrak{S}^2 = (0)$. Since $\mathfrak{S}$ is completely
reducible, it follows that $R\mathfrak{S} = (0)$. Thus $\mathfrak{S} = (0)$, that is, $D(R)$ is solvable.
If $R$ is not characteristically solvable, then by Lemma 4 we see that $R$
contains a one-dimensional ideal $Z$ as a direct summand. Therefore $N$ contains $Z$
as its direct summand, whence $Z(N) \not\subset [N, N]$, contrary to the characteristic
solvability of $N$. Thus we conclude that $R$ is characteristically solvable. The
proof is complete.





We remark that, in Proposition 2 (2), we cannot assert that the nil-radical
of L is the radical of L, though this is true in the characteristically nilpotent
case (cf. Example 2).

As a generalization of (6, Theorem 4), we prove 


PROPOSITION 3. If a Cartan subalgebra of L is characteristically solvable, then
L is solvable.


Proof. Let $S$ and $R$ be a maximal semi-simple subalgebra and the radical of
$L$ respectively. Then a Cartan subalgebra $H$ of $L$ is the sum of a Cartan
subalgebra $H_1$ of $S$ and a subalgebra of $R$, and $H_1$ is a central ideal of $H$ (2,
Proposition 1). Therefore, if $H$ is characteristically solvable, then we have
$H_1 = (0)$, whence $S = (0)$, that is, $L$ is solvable.

We here remark that it is easy to construct a solvable Lie algebra which is 
not nilpotent and whose Cartan subalgebras are characteristically solvable. 


PROPOSITION 4. Let $L$ be the direct sum of the ideals $L_i$ $(i = 1, 2, \ldots, n)$.
Then $L$ is characteristically solvable if and only if $L_i$, for each $i$, is characteristically
solvable.


Proof. This is immediate from Theorem 4 (5) and from the fact that
$Z(L) \subset [L, L]$ if and only if $Z(L_i) \subset [L_i, L_i]$ for all $i$.


PROPOSITION 5. Let $L$ be a Lie algebra which has no proper subalgebra whose
derived algebra is equal to $[L, L]$. If $[L, L]$ is characteristically solvable, then $L$
is characteristically solvable.


Proof. Let $\mathfrak{S}$ be a maximal semi-simple subalgebra of $D(L)$ and suppose
that $\mathfrak{S} \neq (0)$. Then there exists a non-zero semi-simple derivation $D$ in $\mathfrak{S}$.
Let $H$ be the set of all elements of $L$ annihilated by $D$. Then $H$ is an ideal of
$L$ containing $[L, L]$, since we have $[L, L]\mathfrak{S} = (0)$ by the characteristic
solvability of $[L, L]$. There exists a non-empty subspace $U$ of $L$ such that

$L = U + H$ and $UD \subset U$.

We assert that $[U, H] = (0)$. In fact, let $\bar{K}$ be the algebraic closure of the
basic field $K$, let $L^{\bar{K}} = L \otimes_K \bar{K}$ and $U^{\bar{K}} = U \otimes_K \bar{K}$. As usual, we consider that
$U^{\bar{K}} \subset L^{\bar{K}}$ and we identify $D$ with its extended derivation of $L^{\bar{K}}$. Let $\lambda$ be an
eigenvalue of $D$ and let $x$ be an element of $U^{\bar{K}}$ corresponding to $\lambda$. Then, for
any element $y$ of $H$, we have

$[x, y]D = [xD, y] = \lambda[x, y] = 0.$

Since $\lambda \neq 0$, we have $[x, y] = 0$. Since $U^{\bar{K}}$ is spanned by those elements $x$, it
follows that $[U^{\bar{K}}, H] = (0)$ and therefore $[U, H] = (0)$, as was asserted. It
now follows that

$[[U, U], L] \subset [[U, L], U] = [[U, U], U] \subset [H, U] = (0).$

Therefore we have

$[L, L] = [U, U] + [H, H],$

where $[U, U]$ is a central ideal of $[L, L]$. Since $[L, L]$ is characteristically
solvable, it follows that $[U, U] \subset [H, H]$ and therefore that $[L, L] = [H, H]$.
This contradicts our hypothesis since $H$ is a proper subalgebra of $L$. Thus we
see that $\mathfrak{S} = (0)$, that is, that $D(L)$ is solvable. By our hypothesis, $L$ cannot
contain a central ideal as its direct summand. Therefore we conclude that $L$
is characteristically solvable. The proof is complete.

EXAMPLE 1. Let $L$ be the Lie algebra over $K$ with a basis $x_1, x_2$ such that

$[x_1, x_2] = x_2$, $[x_2, x_1] = -x_2$.

As is well known, $L$ is a solvable Lie algebra whose derivation algebra is
isomorphic to $L$. Therefore $L$ is characteristically solvable.


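EXAMPLE 2. Let $L$ be the algebra over $K$ described in terms of a basis
$x_1, x_2, \ldots, x_5$ by the following multiplication table:

$[x_1, x_2] = x_2$, $[x_1, x_3] = x_3$, $[x_1, x_4] = 2x_4$,
$[x_1, x_5] = 3x_5$, $[x_2, x_3] = x_4$, $[x_2, x_4] = x_5$.

In addition $[x_i, x_j] = -[x_j, x_i]$ and for $i < j$ $[x_i, x_j] = 0$ if it is not in the
table above. Then $L$ is a solvable Lie algebra and $[L, L] = (x_2, \ldots, x_5)$.
Let $D$ be a derivation of $L$ and let $x_i D = \sum_{j=1}^{5} \lambda_{ij} x_j$ $(i = 1, 2, \ldots, 5)$. Then
the matrix of $D$ is

$$\begin{pmatrix} 0 & \lambda_{12} & \lambda_{13} & \lambda_{14} & \lambda_{15} \\ 0 & \lambda_{22} & \lambda_{23} & \lambda_{13} & \tfrac{1}{2}\lambda_{14} \\ 0 & 0 & \lambda_{33} & -\lambda_{12} & 0 \\ 0 & 0 & 0 & \lambda_{22} + \lambda_{33} & -\lambda_{12} \\ 0 & 0 & 0 & 0 & 2\lambda_{22} + \lambda_{33} \end{pmatrix}$$

Let $[L, L] = (y_1, \ldots, y_4)$ with $y_i = x_{i+1}$ $(i = 1, \ldots, 4)$. Let $D'$ be a
derivation of $[L, L]$ and let $y_i D' = \sum_{j=1}^{4} \mu_{ij} y_j$ $(i = 1, \ldots, 4)$. Then the matrix of
$D'$ is

$$\begin{pmatrix} \mu_{11} & \mu_{12} & \mu_{13} & \mu_{14} \\ 0 & \mu_{22} & \mu_{23} & \mu_{24} \\ 0 & 0 & \mu_{11} + \mu_{22} & \mu_{23} \\ 0 & 0 & 0 & 2\mu_{11} + \mu_{22} \end{pmatrix}$$

Therefore $L$ and $[L, L]$ are characteristically solvable Lie algebras. The
nil-radical of $L$ is $[L, L]$ and there is no proper subalgebra of $L$ whose derived
algebra is equal to $[L, L]$.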
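
The derivation matrices of Examples 1 and 2 can be cross-checked numerically. The following is a minimal sketch, not the author's method: it assumes the field is the real numbers, uses the row convention $x_i D = \sum_j d_{ij} x_j$, and introduces two hypothetical helpers, `structure_constants` and `derivation_basis`, which solve the linear system expressing the derivation identity on basis elements.

```python
import numpy as np

def structure_constants(n, table):
    """Build c[i, j, k] = coefficient of x_k in [x_i, x_j] from a 1-indexed table."""
    c = np.zeros((n, n, n))
    for (i, j), terms in table.items():
        for k, coeff in terms.items():
            c[i - 1, j - 1, k - 1] = coeff
            c[j - 1, i - 1, k - 1] = -coeff   # antisymmetry
    return c

def derivation_basis(c, tol=1e-10):
    """Return a basis (as n x n matrices) of all derivations of the algebra c."""
    n = c.shape[0]
    rows = []
    for i in range(n):
        for j in range(n):
            for m in range(n):
                # coefficient of x_m in [x_i,x_j]D - [x_i D,x_j] - [x_i,x_j D]
                row = np.zeros((n, n))
                row[:, m] += c[i, j, :]
                row[i, :] -= c[:, j, m]
                row[j, :] -= c[i, :, m]
                rows.append(row.ravel())
    _, s, vt = np.linalg.svd(np.array(rows))
    rank = int(np.sum(s > tol))
    return vt[rank:].reshape(-1, n, n)        # null space of the linear system

ex1 = structure_constants(2, {(1, 2): {2: 1}})
ex2 = structure_constants(5, {(1, 2): {2: 1}, (1, 3): {3: 1}, (1, 4): {4: 2},
                              (1, 5): {5: 3}, (2, 3): {4: 1}, (2, 4): {5: 1}})
derived = structure_constants(4, {(1, 2): {3: 1}, (1, 3): {4: 1}})  # [L, L] of Example 2

for name, c in (("Example 1", ex1), ("Example 2", ex2), ("[L, L]", derived)):
    basis = derivation_basis(c)
    triangular = all(np.allclose(np.tril(b, -1), 0, atol=1e-8) for b in basis)
    print(name, "dim D =", len(basis), "upper triangular:", triangular)
```

Under these assumptions every derivation comes out upper triangular in the given bases, which is the solvability of the derivation algebras used in the examples.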


5. In this section, we summarize some of the results obtained and give some
remarks as partial answers to the questions stated at the beginning of the
introduction.


For the first question in the introduction, we have the following 


THEOREM 5. We have the following statements for the derivation algebras of Lie 
algebras: 

(1) An abelian derivation algebra is one-dimensional and consists of semi- 
simple elements ; 

(2) A non-abelian nilpotent derivation algebra consists of nilpotent elements; 

(3) A reductive derivation algebra is the direct sum of a semi-simple ideal and a 
one-dimensional ideal consisting of semi-simple elements; 

(4) A derivation algebra, which is the direct sum of a semi-simple ideal and a 
non-abelian nilpotent ideal, is the direct sum of a semi-simple ideal and an ideal 
which is another derivation algebra consisting of nilpotent elements. 


It would be interesting to know 

(1) whether or not there exists a characteristically nilpotent derivation 
algebra; 

(2) whether or not there exists a derivation algebra whose radical consists 
of nilpotent elements and is not a direct summand. 

In connection with (1), we note that there exists a characteristically solvable
derivation algebra, for instance, the derivation algebra of the Lie algebra in
Example 1. In connection with (2), we note that, if $D(L)$ is such a derivation
algebra of a solvable Lie algebra $L$, then $L$ must be nilpotent, $L^3 \neq (0)$ and
$\dim L \geq 6$. In fact, it is clear that $L$ is nilpotent. Write $D(L) = \mathfrak{S} + \mathfrak{N}$ where
$\mathfrak{S}$ is a semi-simple subalgebra and $\mathfrak{N}$ is the radical of $D(L)$. If $L^3 = (0)$, then
there exists a subspace $U$ of $L$ such that

$L = U + L^2$, $U \cap L^2 = (0)$, and $U\mathfrak{S} \subset U$.

Define a derivation $D$ of $L$ in the following way:

$xD = x$ for $x$ in $U$,
$yD = 2y$ for $y$ in $L^2$.

Then it is immediate that $[D, \mathfrak{S}] = (0)$. Therefore $D$ is a semi-simple
derivation of $L$ which does not belong to $\mathfrak{S}$. Write $D = D_1 + D_2$ with $D_1$ in $\mathfrak{S}$ and
$D_2$ in $\mathfrak{N}$. Let $D_1 = S_1 + N_1$ be the Jordan sum decomposition of $D_1$: $S_1$ and
$N_1$ are respectively semi-simple and nilpotent derivations of $L$ and
$[S_1, N_1] = 0$. Since $[D, D_1] = 0$, we see that $[D, S_1] = 0$ and $[D, N_1] = 0$. Therefore
$D - S_1$ is a semi-simple derivation of $L$ and $[D - S_1, N_1] = 0$, which shows
that $D_2 = (D - S_1) + (-N_1)$ is the Jordan sum decomposition of $D_2$. Since
$D_2$ is nilpotent, it follows that $D - S_1 = 0$, that is, that $D = S_1$. Since $\mathfrak{S}$
is splittable, $D$ belongs to $\mathfrak{S}$, which is a contradiction. Therefore $L^3 \neq (0)$.
All the nilpotent Lie algebras whose dimensions are $\leq 5$ are determined in
(3, Proposition 1). Therefore we can calculate the derivation algebras of those
Lie algebras to see that their radicals contain non-zero semi-simple derivations.
Thus $\dim L \geq 6$.


As for the second question in the introduction, we have the following 


THEOREM 6. Let D(L) be the derivation algebra of a Lie algebra L. Then: 

(1) D(L) is abelian if and only if L is one-dimensional; 

(2) D(L) is non-abelian nilpotent if and only if L is characteristically
nilpotent;

(3) D(L) is non-nilpotent solvable if and only if either L is characteristically 
solvable and not characteristically nilpotent or L is the direct sum of a character- 
istically solvable ideal and a one-dimensional central ideal; 

(4) D(L) is reductive (resp. semi-simple) if and only if L is reductive (resp. 
semi-simple) ; 

(5) D(L) is the direct sum of a semi-simple ideal and the non-abelian nil- 
potent radical if and only if L is the direct sum of a semi-simple ideal and a 
characteristically nilpotent ideal; 

(6) D(L) is the direct sum of a semi-simple ideal and the non-nilpotent radical 
if and only if either L is the direct sum of a semi-simple ideal and a characteris- 
tically solvable ideal which is not characteristically nilpotent or L is the direct 
sum of a semi-simple ideal, a characteristically solvable ideal and a one-dimen- 
sional central ideal. 


Finally, we note the following example, which shows that non-isomorphic 


Lie algebras can have isomorphic derivation algebras: 


EXAMPLE 3. Let $A_1$, $A_2$ be abelian Lie algebras such that $\dim A_1 \neq \dim A_2$.
Then $D(A_i)$ is the direct sum of a semi-simple ideal $S_i$ and the one-dimensional
ideal $Z_i$ $(i = 1, 2)$. Let $L_1$ (resp. $L_2$) be the direct sum of $S_2$ and $A_1$ (resp.
$S_1$ and $A_2$). Then, by using Lemma 1, we see that $D(L_1)$ (resp. $D(L_2)$) is the
direct sum of ideals $D(S_2)$, $S_1$, and $Z_1$ (resp. $D(S_1)$, $S_2$, and $Z_2$). Since $D(S_i)$
is isomorphic to $S_i$ $(i = 1, 2)$, it follows that $D(L_1)$ is isomorphic to $D(L_2)$.
But $L_1$ is not isomorphic to $L_2$.
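
As a concrete instance of Example 3, one may take (as an illustrative assumption not made in the text) a base field $K$ of characteristic 0 with $\dim A_1 = 2$ and $\dim A_2 = 3$, so that $D(A_i) = \mathfrak{gl}(A_i)$ and $S_1 \cong \mathfrak{sl}(2, K)$, $S_2 \cong \mathfrak{sl}(3, K)$. A dimension count then reads:

```latex
\begin{align*}
L_1 &= \mathfrak{sl}(3,K) \oplus K^2, & \dim L_1 &= 8 + 2 = 10,\\
L_2 &= \mathfrak{sl}(2,K) \oplus K^3, & \dim L_2 &= 3 + 3 = 6,\\
D(L_1) &\cong \mathfrak{sl}(3,K) \oplus \mathfrak{sl}(2,K) \oplus K, & \dim D(L_1) &= 8 + 3 + 1 = 12,\\
D(L_2) &\cong \mathfrak{sl}(2,K) \oplus \mathfrak{sl}(3,K) \oplus K, & \dim D(L_2) &= 3 + 8 + 1 = 12.
\end{align*}
```

Here $L_1$ and $L_2$ already differ in dimension, while their derivation algebras have the same summands.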
















REFERENCES 


1. C. Chevalley, Théorie des groupes de Lie, tome III, Act. Sci. Ind., no. 1226 (Paris, 1955).
2. J. Dixmier, Sous-algèbres de Cartan et décompositions de Levi dans les algèbres de Lie, Trans.
Roy. Soc. Canada, Series III, Section III, 20 (1956), 17-21.
3. J. Dixmier, Sur les représentations unitaires des groupes de Lie nilpotents III, Can. J. Math., 10
(1958), 321-348.
4. J. Dixmier and W. G. Lister, Derivations of nilpotent Lie algebras, Proc. Amer. Math. Soc.,
8 (1957), 155-158.
5. G. Hochschild, Semi-simple algebras and generalized derivations, Amer. J. Math., 64 (1942),
677-694.
6. G. Leger and S. Tôgô, Characteristically nilpotent Lie algebras, Duke Math. J., 26 (1959),
623-628.
7. S. Tôgô, On the derivations of Lie algebras, J. Sci. Hiroshima Univ., Ser. A, 19 (1955), 71-77.


Northwestern University 
and 
Hiroshima University 






















AN ENUMERATION PROBLEM RELATED TO THE 
NUMBER OF LABELLED BI-COLOURED GRAPHS 


C. Y. LEE


We will consider the following enumeration problem. Let $A$ and $B$ be finite
sets with $\alpha$ and $\beta$ elements respectively. Let $n$ be some positive
integer such that $n \leq \alpha\beta$. A subset $S$ of the product set $A \times B$ of exactly $n$
distinct ordered pairs $(a_i, b_j)$ is said to be admissible if given any $a \in A$ and
$b \in B$, there exist elements $(a_i, b_j)$ and $(a_k, b_l)$ (they may be the same) in
$S$ such that $a_i = a$ and $b_l = b$. We shall find here a generating function for
the number $N(\alpha, \beta; n)$ of distinct admissible subsets of $A \times B$ and from this
generating function, an explicit expression for $N(\alpha, \beta; n)$. In obtaining this
result, the idea of a cut probability is used. This approach in a problem of
enumeration may be of interest.

One may consider $A$ and $B$ as (say) two chess teams competing with each
other. $N(\alpha, \beta; n)$ is then the number of ways of having $n$ simultaneous matches
between the two teams such that a player may be involved in several matches
but there is at most one match between a pair of players and such that no
player is left idle.

In terms of graph theory, $N(\alpha, \beta; n)$ is interpreted as follows. Consider a
set of $\alpha + \beta$ labelled nodes of which $\alpha$ are in one colour and $\beta$ are in another.
$N(\alpha, \beta; n)$ is then the number of distinct 2-coloured graphs having exactly $n$
branches on this set of nodes such that no node is allowed to be isolated.

In reference (2), using Pólya's theorem (1), Harary obtained expressions
for the number of bi-coloured graphs. His results differ from ours first in the
respective methods of approach and second in the fact that his enumeration
was for graphs with unlabelled nodes.


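THEOREM. Let $F$ be a generating function for $N(\alpha, \beta; n)$:

$$F(x; \alpha, \beta) = \sum_{n=1}^{\alpha\beta} N(\alpha, \beta; n)\, x^n.$$

Then,

$$F(x; \alpha, \beta) = \sum_{k=0}^{\alpha} (-1)^{\alpha+\beta+k} \binom{\alpha}{k} \left(1 - (1 + x)^k\right)^{\beta}.$$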


The idea of the proof goes as follows. We consider a certain bi-rooted graph 
G and define for G a cut probability P. This probability P can be computed 


Received January 19, 1959. 
















in two ways in one of which $N(\alpha, \beta; n)$, except for sign, enters as certain
coefficients. The theorem is then proved by extracting certain coefficients of $P$
and relating them to our enumeration problem.

Let $G$ be the bi-rooted graph shown in Fig. 1,

[Fig. 1.]

in which $G$ has $\alpha$ nodes next to the left root-node and $\beta$ nodes next to the right
root-node. There is one branch connecting each of the $\alpha$ nodes to the left
root-node, one branch connecting each of the $\beta$ nodes to the right root-node, and
one branch between every $\alpha$-node and every $\beta$-node.

Consider now each branch as a piece of string and let $1 - q$, $1 - r$, and
$1 - s$ be respectively the probability that a left, middle, and right branch be
cut in two (disconnected), and let us assume that the random variables (one
for each branch) are independent. The cut probability $P$ for the graph $G$ is
then the probability that $G$ is cut in the sense that the left root-node and the
right root-node are disconnected.


LEMMA 1. The cut probability $P$ for the graph $G$ is given by

$$P(q, r, s) = \sum_{k=0}^{\alpha} \binom{\alpha}{k} q^k (1 - q)^{\alpha-k} \left(s(1 - r)^k + (1 - s)\right)^{\beta}.$$
k=0 


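Proof. Break the $\alpha$ left branches into two sets $L_1$, $L_2$ of $k$ and $\alpha - k$ elements
in each and break the $\beta$ right branches into two sets $R_1$, $R_2$ of $j$ and $\beta - j$
elements in each. Let $E_{kj}$ be the event that every branch in $L_2$ and in $R_2$ is
cut and every branch in $L_1$ and $R_1$ is left uncut, and that the graph $G$ is itself
cut. Then

$$\Pr\{E_{kj}\} = q^k (1 - q)^{\alpha-k} s^j (1 - s)^{\beta-j} (1 - r)^{kj}.$$

It then follows that

$$P(q, r, s) = \sum_{k=0}^{\alpha} \sum_{j=0}^{\beta} \binom{\alpha}{k} \binom{\beta}{j} \Pr\{E_{kj}\} = \sum_{k=0}^{\alpha} \binom{\alpha}{k} q^k (1 - q)^{\alpha-k} \left(s(1 - r)^k + (1 - s)\right)^{\beta}.$$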
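
Lemma 1 can be checked by exhaustive enumeration on small graphs. The sketch below is an illustration under stated assumptions rather than part of the proof: $q$, $r$, $s$ are taken as the probabilities that a left, middle, or right branch is left uncut, $G$ is counted as cut when none of the $\alpha\beta$ three-branch root-to-root paths (one per middle branch) is left intact, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def cut_probability_formula(alpha, beta, q, r, s):
    """Closed form of Lemma 1."""
    return sum(
        comb(alpha, k) * q**k * (1 - q)**(alpha - k)
        * (s * (1 - r)**k + (1 - s))**beta
        for k in range(alpha + 1)
    )

def cut_probability_brute(alpha, beta, q, r, s):
    """Sum the probabilities of all branch states with no intact root-to-root path."""
    total = Fraction(0)
    for left in product((0, 1), repeat=alpha):            # 1 = branch left uncut
        for right in product((0, 1), repeat=beta):
            for mid in product((0, 1), repeat=alpha * beta):
                intact_path = any(
                    left[i] and mid[i * beta + j] and right[j]
                    for i in range(alpha) for j in range(beta)
                )
                if intact_path:
                    continue
                p = Fraction(1)
                for state, prob in ((left, q), (mid, r), (right, s)):
                    for bit in state:
                        p *= prob if bit else 1 - prob
                total += p
    return total

q, r, s = Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)
print(cut_probability_formula(2, 2, q, r, s) == cut_probability_brute(2, 2, q, r, s))
```

Exact rational arithmetic is used so that the two values can be compared for equality rather than within a tolerance.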




LEMMA 2. Let $f(r)$ be the coefficient of the term $q^{\alpha} s^{\beta}$ in $P(q, r, s)$. Then

(i) $$f(r) = \sum_{k=0}^{\alpha} (-1)^{\alpha+\beta+k} \binom{\alpha}{k} \left(1 - (1 - r)^k\right)^{\beta};$$

and

(ii) Writing $f(r)$ as

$$f(r) = C_0 + C_1 r + \ldots + C_{\alpha\beta}\, r^{\alpha\beta},$$

the coefficient $C_n$ of $r^n$ is $(-1)^n N(\alpha, \beta; n)$.


Proof. By expanding the expression $P(q, r, s)$, (i) follows. To prove (ii), we
note first that each middle branch defines a unique path from the left
root-node to the right root-node. Therefore there are $\alpha\beta$ distinct (not independent)
such paths. Let $D_i$ be the event that the $i$th path is uncut. Then

$$P(q, r, s) = 1 - \Pr\{D_1 \cup D_2 \cup \ldots \cup D_{\alpha\beta}\} = 1 - \sum_{i} \Pr\{D_i\} + \sum_{i<j} \Pr\{D_i \cap D_j\} - \ldots + (-1)^{\alpha\beta} \Pr\{D_1 \cap \ldots \cap D_{\alpha\beta}\}.$$

Now each sum can be expressed as

$$\sum_{i_1 < \ldots < i_n} \Pr\{D_{i_1} \cap \ldots \cap D_{i_n}\} = r^n g(q, s)$$

where $g(q, s)$ is a polynomial in $q$ and $s$, and $r^n$ can appear nowhere else in the
above expression for $P(q, r, s)$. Let $d(\alpha, \beta)$ be the coefficient of $q^{\alpha} s^{\beta}$ in $g(q, s)$.
Then the number $d(\alpha, \beta)$, except for sign, is the number of distinct graphs
involving $\alpha + \beta$ nodes and $n$ branches such that each graph satisfies the
conditions given in the beginning. A check of sign yields
$d(\alpha, \beta) = (-1)^n N(\alpha, \beta; n)$. Since $d(\alpha, \beta)$ is just the coefficient of $r^n$ in $f(r)$, the lemma follows.


Proof of theorem. It follows from Lemma 2 that

$$\sum_{n=0}^{\alpha\beta} (-1)^{n} N(\alpha, \beta; n)\, r^{n} = f(r).$$

Hence, the generating function $F$ is
$$F(x; \alpha, \beta) = \sum_{n} N(\alpha, \beta; n)\, x^{n} = f(-x)
= \sum_{k=0}^{\alpha} (-1)^{\alpha+\beta+k} \binom{\alpha}{k}\bigl(1 - (1+x)^{k}\bigr)^{\beta},$$

and the theorem follows.
As an example, we find for $\alpha = 3$, $\beta = 2$ that

$$F(x; 3, 2) = 6x^{3} + 12x^{4} + 6x^{5} + x^{6}$$
so that $N(3, 2; 3) = 6$, $N(3, 2; 4) = 12$, $N(3, 2; 5) = 6$, and $N(3, 2; 6) = 1$.
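
The closed form can be checked numerically. The sketch below is not part of the paper. It assumes that $N(\alpha, \beta; n)$ counts the graphs with $n$ branches joining $\alpha$ labelled left nodes to $\beta$ labelled right nodes in which no node is left isolated (the paper's exact conditions are stated earlier and are not repeated here), and it compares the coefficients of $\sum_{k} (-1)^{\alpha+\beta+k}\binom{\alpha}{k}(1-(1+x)^{k})^{\beta}$ with a brute-force count. All function names are illustrative.

```python
from itertools import combinations
from math import comb


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(a, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, a)
    return out


def generating_function(alpha, beta):
    """Coefficients of sum_k (-1)^(alpha+beta+k) C(alpha,k) (1-(1+x)^k)^beta."""
    total = [0] * (alpha * beta + 1)
    for k in range(alpha + 1):
        base = [-comb(k, i) for i in range(k + 1)]  # -(1+x)^k
        base[0] += 1                                # 1 - (1+x)^k
        term = poly_pow(base, beta)
        sign = (-1) ** (alpha + beta + k)
        for power, coeff in enumerate(term):
            total[power] += sign * comb(alpha, k) * coeff
    return total


def brute_force(alpha, beta):
    """Count edge sets of K_{alpha,beta}, by size, that leave no node isolated
    (the assumed meaning of N(alpha, beta; n))."""
    edges = [(i, j) for i in range(alpha) for j in range(beta)]
    counts = [0] * (len(edges) + 1)
    for n in range(len(edges) + 1):
        for chosen in combinations(edges, n):
            left = {i for i, _ in chosen}
            right = {j for _, j in chosen}
            if len(left) == alpha and len(right) == beta:
                counts[n] += 1
    return counts


print(generating_function(3, 2))
print(brute_force(3, 2))
```

Under that assumption both calls return the coefficient list of $6x^{3} + 12x^{4} + 6x^{5} + x^{6}$, in agreement with the example.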













From this theorem, the expression for $N(\alpha, \beta; n)$ can be derived explicitly.
Let $\langle x \rangle$ denote the least integer greater than or equal to $x$; the following
iteration of summations then obtains.


LEMMA 3.
$$\sum_{j=0}^{\beta} \sum_{i=0}^{kj} f(i, j) = \sum_{i=0}^{k\beta} \sum_{j=\langle i/k \rangle}^{\beta} f(i, j).$$


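Using this lemma and expanding the generating function for $N(\alpha, \beta; n)$,
one gets


COROLLARY.

$$N(\alpha, \beta; n) = \sum_{k=0}^{\alpha} \sum_{j=0}^{\beta} (-1)^{\alpha+\beta+k+j} \binom{\alpha}{k}\binom{\beta}{j}\binom{kj}{n}.$$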


The writer is indebted to J. Riordan for pointing out the following identity,
which appears novel, and for other enlightening comments. In the above
corollary, if we set $\alpha = \beta = n$, then $N(n, n; n)$ is the number of permutations
on $n$ objects. The following identity therefore obtains:

$$\sum_{k=0}^{n} \sum_{j=0}^{n} (-1)^{k+j} \binom{n}{k}\binom{n}{j}\binom{kj}{n} = n!.$$
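
As a quick sanity check, not taken from the paper, the identity can be evaluated for small $n$. The sketch assumes the double sum runs over $0 \le k, j \le n$ with the sign $(-1)^{k+j}$; terms with $kj < n$ vanish because the last binomial coefficient is then zero. The function name is illustrative.

```python
from math import comb, factorial


def identity_lhs(n):
    """Evaluate sum_{k,j} (-1)^(k+j) C(n,k) C(n,j) C(k*j, n) over 0 <= k, j <= n."""
    return sum(
        (-1) ** (k + j) * comb(n, k) * comb(n, j) * comb(k * j, n)
        for k in range(n + 1)
        for j in range(n + 1)
    )


for n in range(1, 7):
    print(n, identity_lhs(n), factorial(n))
```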


REFERENCES 


1. G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen, und chemische
Verbindungen, Acta Math., 68 (1937), 145-254.
2. F. Harary, On the number of bi-colored graphs, Pac. J. Math., 8 (1958), 743-755.


Bell Telephone Labs., Inc. 
Whippany, N.J. 













PROGRAMMES IN PAIRED SPACES 
K. S. KRETSCHMER 


1. Introduction. The basic problem of linear programming is to minimize 
(or maximize) a linear function of a finite number of variables constrained by 
a finite number of linear inequalities. From a mathematical point of view the 
subject may be regarded as being divided into two areas. One is primarily 
analytical and deals with certain questions of duality and consistency. The 
other is algorithmic and is concerned with computational questions and 
methods. 

The analytical aspects of the problem were first discussed in (12). Later 
papers which expand on this work appear in (6) and (23). The principal com- 
putational technique used in linear programming was first discussed in (9). 
The results of some of the later research in this area are reported in (13). 

This paper is concerned with the duality and existence theorems of linear 
programming under more general conditions. The problem is discussed from 
the standpoint of paired linear spaces rather than that of locally convex 
spaces since the theory is slightly easier to develop from this point of view. 
It also has the advantage of containing certain aspects of previous work in 
this area (5; 10; 15) as special cases. In preparing this work the author has 
benefited greatly from the work in (5) and (10). The work in (15) appeared 
after the main part of the paper was completed. It should be mentioned that 
several theorems in this paper can be obtained from those in (5) and (10) 
if one modifies the topologies considered there in an appropriate manner. 

The main part of the paper starts with § 2 which contains a few lemmas 
which will be required later in the paper. In § 3 a programme and its dual are 
defined and several relations between these programmes are established. 
Section 4 is devoted to the form a programme can take in locally convex 
spaces. Section 5 contains several examples intended to illustrate some of the 
inherent complexities of the subject and to indicate how the theory can be 
applied in certain kinds of non-linear programming problems, continuous 
games, and Kantorovitch’s generalization of the transportation problem. 


2. Preliminary lemmas. It is assumed that the reader is somewhat familiar
with the theory of linear topological spaces as developed in (3; 4). The only
departures from the terminology of these references are that "net" is used instead
of "filter", the word "neighbourhood" means "open neighbourhood", and
"équilibrée," "tonneau," and "espace tonnelé" are translated into "circled," "disk,"
and "disk space," respectively. The real numbers are denoted by $R$ and the
non-negative real numbers by $R_0$. Bilinear functionals are denoted by $((\,,))$.
If the linear spaces $X$ and $Y$ are in duality, the weak topology on $X$ is denoted
as $w(X, Y)$ and the Mackey topology on $X$ is denoted as $s(X, Y)$. If $C$ is a
cone in $X$, the set $C^{+}$ is the set $\{y;\ y \in Y$ and $((x, y)) \ge 0$ for all $x \in C\}$ and
the set $C^{++}$ is the set $\{x;\ x \in X$ and $((x, y)) \ge 0$ for all $y \in C^{+}\}$.


Received September 20, 1959. This work is based on an earlier paper which was accepted by
the Carnegie Institute of Technology in partial fulfillment of the degree of Doctor of
Philosophy. The research underlying the dissertation was effected under the careful guidance of
R. J. Duffin and was sponsored in part by the Office of Ordnance Research, U.S. Army,
Contract DA-36-061-ORD-490. Further research was supported by the Computation and
Data Processing Center of the University of Pittsburgh.

The author is indebted to D. Bratton for his incisive comments on a preliminary version
of this paper.


LEMMA 1. Let $C_1$ and $C_2$ be convex cones in a linear topological space $(X, T)$
and suppose that $C_2$ has a non-null $T$-interior $C_2^{0}$. If $(C_1 + C_2)^{0}$ denotes the
$T$-interior of $C_1 + C_2$ then $(C_1 + C_2)^{0} \subset C_1 + C_2^{0}$.


Proof. Since $C_2^{0}$ is not empty, $(C_1 + C_2)^{0}$ is not empty. If $x_0 \in (C_1 + C_2)^{0}$
and $x \in C_2^{0}$ then there exists a $\delta > 0$ such that $x_0 - \delta x \in (C_1 + C_2)^{0}$ and
we may assume that $\delta < 1$. Suppose $x_0 - \delta x = x_1 + x_2$ where $x_1 \in C_1$ and
$x_2 \in C_2$. Then $x_0 = x_1 + (x_2 + \delta x)$ and consequently $x_2 + \delta x \in C_2^{0}$ (3, p.
51, Proposition 5). Therefore $x_0 \in C_1 + C_2^{0}$.


LEMMA 2. Let $X$ and $Y$ be linear spaces paired under $((\,,))$ and $C$ be a convex
cone in $X$. Then $C^{+}$ is $T_1$-closed and $C^{++}$ coincides with the $T_2$-closure of $C$.
Here $T_1$ ($T_2$) is any topology on $Y$ ($X$) which is compatible with the duality between $X$
and $Y$.


Proof. By (4, p. 67, Proposition 4) and the convexity of $C^{+}$ and $C^{++}$ it is
sufficient to prove these statements for $T_1 = w(Y, X)$ and $T_2 = w(X, Y)$.
It is immediate that $C^{+}$ is $w(Y, X)$-closed and $C^{++}$ is $w(X, Y)$-closed. Since
$C \subset C^{++}$ the $w(X, Y)$-closure of $C$ is contained in $C^{++}$. Suppose $x_0$ is not an
element of the $w(X, Y)$-closure of $C$. Since $w(X, Y)$ is locally convex, there
exists a $y \in Y$ which satisfies the condition that $((x_0, y)) < 0$ and $((x, y)) \ge 0$
for all $x$ in the $w(X, Y)$-closure of $C$ (3, p. 73, Proposition 4; 4, pp. 50 and 69).
Consequently, there exists a $y \in C^{+}$ which satisfies $((x_0, y)) < 0$. Thus $x_0$
is not an element of $C^{++}$.


LEMMA 3. Let $X$ and $Y$ be linear spaces paired under $((\,,))_1$, $Z$ and $W$ be
linear spaces paired under $((\,,))_2$, and suppose that $T$ is a linear transformation
from $X$ into $Z$. In order that $T$ be $w(X, Y) - w(Z, W)$ continuous it is necessary
and sufficient that there exists a dual map $T^{*}$. If $T^{*}$ exists, it is unique and
$w(W, Z) - w(Y, X)$ continuous. Furthermore, if $T^{*}$ exists $T$ is $s(X, Y) - s(Z, W)$
continuous.


Proof. The first statement is proved in (4, p. 100, Proposition 1). The first 
part of the second statement follows from the fact that the bilinear functionals 
separate points. The second part is proved in (4, p. 101). The last statement 
is proved in (17, § 30.2). 







LEMMA 4. Let $X$ and $Y$ be vector spaces paired under $((\,,))_1$ and $Z$ and $W$ be
vector spaces paired under $((\,,))_2$. Suppose that $P$ is a convex cone in $X$ which is
$w(X, Y)$-closed, $Q$ is a convex cone in $Z$ which is $w(Z, W)$-closed, and $T$ is a
linear transformation of $X$ into $Z$ which is $w(X, Y) - w(Z, W)$ continuous.
Then $(P \cap T^{-1}(Q))^{+}$ coincides with the $w(Y, X)$-closure of $T^{*}(Q^{+}) + P^{+}$ and
hence with the closure of $T^{*}(Q^{+}) + P^{+}$ for any topology on $Y$ which is
compatible with the duality between $X$ and $Y$.


Proof.
$$(T^{*}(Q^{+}) + P^{+})^{+} = \{x;\ ((x, T^{*}w + y))_1 \ge 0 \text{ for all } w \in Q^{+},\ y \in P^{+}\}$$
$$= \{x;\ ((Tx, w))_2 + ((x, y))_1 \ge 0 \text{ for all } w \in Q^{+},\ y \in P^{+}\}.$$
It is clear that the vector spaces $Z \times X$ and $W \times Y$ are paired under the
bilinear functional $((\,,))$ defined by

$$((\,(z, x), (w, y)\,)) = ((z, w))_2 + ((x, y))_1 \quad \text{for all } x \in X,\ y \in Y,\ z \in Z, \text{ and } w \in W.$$

Moreover, $(T^{*}(Q^{+}) + P^{+})^{+} = \{x;\ ((\,(Tx, x), (w, y)\,)) \ge 0$ for $(w, y) \in Q^{+} \times P^{+}\}$.
Applying Lemma 2, we obtain that

$$(T^{*}(Q^{+}) + P^{+})^{+} = \{x;\ (Tx, x) \in Q \times P\} = \{x;\ Tx \in Q \text{ and } x \in P\} = T^{-1}(Q) \cap P.$$
Consequently, $(T^{-1}(Q) \cap P)^{+} = (T^{*}(Q^{+}) + P^{+})^{++}$. The proof is completed
by referring to Lemma 2 and (4, p. 67, Proposition 4).


LEMMA 5. In addition to the conditions of Lemma 4, assume that $Q$ has a non-empty
$s(Z, W)$-interior, $Q^{0}$, and that $P \cap T^{-1}(Q^{0})$ is not empty. Then
$(P \cap T^{-1}(Q))^{+} = T^{*}(Q^{+}) + P^{+}$. In other words, $T^{*}(Q^{+}) + P^{+}$ is $w(Y, X)$-closed and
hence closed for any topology on $Y$ which is compatible with the duality between
$X$ and $Y$.


Proof. The proof is not difficult if the null element of $Z$ belongs to $Q^{0}$. When
this condition does not obtain, the ensuing argument applies. Lemma 4 established
that $T^{*}(Q^{+}) + P^{+} \subset (P \cap T^{-1}(Q))^{+}$. The converse inclusion may be
proved as follows. Suppose $y \in (P \cap T^{-1}(Q))^{+}$. If $y = 0$ then $y \in T^{*}(Q^{+}) + P^{+}$.
If $y \ne 0$ then $x \in P$ and $Tx \in Q$ implies $((x, y))_1 \ge 0$. By assumption there
exists $x_0 \in P$ with $Tx_0 \in Q^{0}$. It is easy to show that $((x_0, y))_1 > 0$. If
$y \notin T^{*}(Q^{+}) + P^{+}$ then $r \in R$, $w \in Q^{+}$, and $ry - T^{*}w \in P^{+}$ implies $r \le 0$.
Equivalently, if $B(r, w) = ry - T^{*}w$ then $(r, w) \in R \times Q^{+}$ and $B(r, w) \in P^{+}$
implies $((\,(-1, 0), (r, w)\,)) = -r + ((0, w))_2 \ge 0$. Thus, $(-1, 0)$ is an element
of the $s(R \times Z, R \times W)$-closure of $B^{*}(P) + (\{0\} \times Q)$, where $\{0\}$ is the
cone consisting of the real number zero and $B^{*}$ is the dual map of $B$, that is,
$B^{*}x = (\,((x, y))_1, -Tx)$ for all $x \in X$. Consequently there exist nets $\{x_\alpha\} \subset P$
and $\{z_\alpha\} \subset Q$ such that $\{-Tx_\alpha + z_\alpha\}$ $s(Z, W)$-converges to 0 and $\{((x_\alpha, y))_1\}$
converges to $-1$.

If $x_\alpha' = 2((x_0, y))_1 x_\alpha + x_0$ and $z_\alpha' = 2((x_0, y))_1 z_\alpha$ then $\{Tx_\alpha' - z_\alpha'\}$
$s(Z, W)$-converges to $Tx_0$ and $\{((x_\alpha', y))_1\}$ converges to $-((x_0, y))_1$. Since $Tx_0 \in Q^{0}$
there exists an $\bar\alpha$ such that $Tx_\alpha' - z_\alpha' \in Q^{0}$ for $\alpha \ge \bar\alpha$. The same reasoning as
in Lemma 1 insures that $Tx_\alpha' \in Q^{0}$ for $\alpha \ge \bar\alpha$. In addition, $x_\alpha' \in P$ so
$((x_\alpha', y))_1 \ge 0$ for $\alpha \ge \bar\alpha$. This implies that the greatest lower bound of the net
$\{((x_\alpha', y))_1\}$ is non-negative. This is a contradiction since the net $\{x_\alpha'\}$ was constructed
so that the net $\{((x_\alpha', y))_1\}$ converged to $-((x_0, y))_1 < 0$. Thus $y \in T^{*}(Q^{+}) + P^{+}$
whenever $y \in (P \cap T^{-1}(Q))^{+}$. The proof is completed by referring to
(4, p. 67, Proposition 4).


LEMMA 6. In addition to the conditions of Lemma 4, assume that $y_0 \in Y$, and
$z_0 \in Z$. Then a necessary and sufficient condition for $x \in P$ and $Tx - z_0 \in Q$
to imply that $((x, y_0))_1 \ge k$ is that $x \in P$, $r \in R_0$, and $Tx - rz_0 \in Q$ implies
$((x, y_0))_1 \ge rk$.


Proof. The sufficiency of the condition is obvious. The necessity of the
condition is straightforward except for the case when $r = 0$. In this case it
must be shown that if $x \in P$ and $Tx - z_0 \in Q$ implies that $((x, y_0))_1 \ge k$,
then $x \in P$ and $Tx \in Q$ implies that $((x, y_0))_1 \ge 0$. To prove this, assume
that the latter condition is false. Then there exists an $x_1 \in P$ with $Tx_1 \in Q$
and $((x_1, y_0))_1 = h < 0$. Then for every $n \ge 1$, $nx_1 \in P$, $T(nx_1) \in Q$, and
$((nx_1, y_0))_1 = nh < 0$. If $x \in P$ and $Tx - z_0 \in Q$ then $x + nx_1$ also satisfies
the same condition for $n \ge 1$. However, it is clear that $((x + nx_1, y_0))_1 - k$
cannot remain non-negative for all $n \ge 1$. Consequently the condition is also
necessary.


3. Programmes in paired spaces. Let $X$ and $Y$ be linear spaces paired
under $((\,,))_1$ and $Z$ and $W$ be linear spaces paired under $((\,,))_2$. A programme
for these paired spaces is a quintuple $(A, P, Q, y_0, z_0)$. In this quintuple it is
assumed that $A$ is a linear transformation from $X$ into $Z$ which is $w(X, Y) - w(Z, W)$
continuous, $P$ is a convex cone in $X$ which is $w(X, Y)$-closed, $Q$ is a
convex cone in $Z$ which is $w(Z, W)$-closed, $y_0$ is an element of $Y$, and $z_0$ is
an element of $Z$. The programme is said to be consistent if and only if there
exists $x \in P$ such that $Ax - z_0 \in Q$. Such an $x$ is called feasible. If the
programme is consistent, its value is defined as $M = \inf ((x, y_0))_1$ (the infimum
taken over feasible $x$). A feasible $x$ is called extremal if and only if $((x, y_0))_1 = M$.
The programme is said to be convergent if it is consistent, has a finite value
$M$, and there is an extremal element. The programme is called subconsistent if
and only if $z_0$ is an element of the $w(Z, W)$-closure of the set $A(P) - Q$. A net
$\{x_\alpha\} \subset P$ is called feasible if the programme is subconsistent and there exists
a net $\{z_\alpha\} \subset Q$ such that $\{Ax_\alpha - z_\alpha\}$ $w(Z, W)$-converges to $z_0$. The net
$\{z_\alpha\}$ is called an associated net for $\{x_\alpha\}$. If the programme is subconsistent its
subvalue is defined as $m = \inf \liminf \{((x_\alpha, y_0))_1\}$ (the infimum taken over
feasible nets $\{x_\alpha\}$). A feasible net $\{x_\alpha\}$ is called extremal if and only if
$\lim \{((x_\alpha, y_0))_1\} = m$.

Associated with the programme $(A, P, Q, y_0, z_0)$ is the programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ for $W$ and $Z$ paired under ${}_2((\,,))$ and $Y$ and $X$ paired under
${}_1((\,,))$. The bilinear functionals ${}_2((\,,))$ and ${}_1((\,,))$ are defined by ${}_2((w, z)) = ((z, w))_2$
for all $w \in W$ and $z \in Z$, and ${}_1((y, x)) = ((x, y))_1$ for all $y \in Y$
and $x \in X$. The programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is said to be the dual of
the programme $(A, P, Q, y_0, z_0)$. It is also called the dual programme. Its value
is denoted by $M'$ and its subvalue is denoted by $m'$. It is easy to see that the
dual programme is well defined, consistent if and only if there exists $w \in Q^{+}$
such that $y_0 - A^{*}w \in P^{+}$, and subconsistent if and only if $y_0$ is an element
of the $w(Y, X)$-closure of the set $A^{*}(Q^{+}) + P^{+}$. If the dual programme is
consistent, then $M' = -\sup ((z_0, w))_2$ (the supremum taken over feasible $w$).
If the dual programme is subconsistent, then $m' = -\sup \limsup \{((z_0, w_\alpha))_2\}$
(the supremum taken over feasible nets $\{w_\alpha\}$). A feasible $w$ is extremal if
and only if $((z_0, w))_2 = -M'$ and a feasible net $\{w_\alpha\}$ is extremal if and only
if $\lim \{((z_0, w_\alpha))_2\} = -m'$.


Several elementary relations are given in the first theorem. 


THEOREM 1.

(a) If a programme is consistent, it is subconsistent and $M \ge m$.

(b) If the dual programme is consistent, it is subconsistent and $M' \ge m'$.

(c) If a programme is subconsistent and has a finite subvalue, there always
exist extremal nets.

(d) If a programme and its dual are consistent then $M$ and $M'$ are finite and
$M \ge -M'$.

(e) If $\bar x$ is feasible for the programme and $\bar w$ is feasible for the dual programme
then $((\bar x, y_0 - A^{*}\bar w))_1 \ge 0$ and $((A\bar x - z_0, \bar w))_2 \ge 0$. If $M = -M'$ then $\bar x$ and
$\bar w$ are extremal if and only if $((\bar x, y_0 - A^{*}\bar w))_1 = 0$ and $((A\bar x - z_0, \bar w))_2 = 0$.

(f) If $\bar x$ is feasible for the programme, $\bar w$ is feasible for the dual programme,
and $((\bar x, y_0))_1 = ((z_0, \bar w))_2$ then $\bar x$ and $\bar w$ are extremal.

(g) If the dual programme is consistent and has a finite value $M'$ then an $\bar x$
which is feasible for the programme and satisfies $((\bar x, y_0))_1 = -M'$ is extremal.
(h) If the programme is consistent and has a finite value $M$ then a $\bar w$ which
is feasible for the dual programme and satisfies $((z_0, \bar w))_2 = M$ is extremal.


(i) The dual of the dual programme is equivalent to the programme itself. 
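
A finite-dimensional illustration of parts (d), (e) and (f) may help fix the sign conventions. The sketch below is not from the paper: it takes $X = Y = Z = W = R^2$ with the usual inner product, $P = Q$ the positive quadrant, and a hypothetical matrix `A` and vectors `y0`, `z0` chosen only for the example. Feasible points are searched on a coarse rational grid, so the extreme values are found only because the optimal points happen to lie on that grid.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical data: minimise y0.x subject to x >= 0 and A x - z0 >= 0.
A = [[1, 1], [1, 3]]
y0 = [2, 3]
z0 = [2, 3]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def apply(matrix, v):
    return [dot(row, v) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def primal_feasible(x):
    """x in P and A x - z0 in Q."""
    return all(c >= 0 for c in x) and all(a - z >= 0 for a, z in zip(apply(A, x), z0))


def dual_feasible(w):
    """w in Q+ and y0 - A* w in P+ (both cones are the positive quadrant here)."""
    return all(c >= 0 for c in w) and all(
        y - a >= 0 for y, a in zip(y0, apply(transpose(A), w))
    )


def slacks(x, w):
    """The two non-negative quantities of part (e)."""
    first = dot(x, [y - a for y, a in zip(y0, apply(transpose(A), w))])
    second = dot([a - z for a, z in zip(apply(A, x), z0)], w)
    return first, second


grid = [F(i, 2) for i in range(9)]  # 0, 1/2, ..., 4
primal = [x for x in product(grid, repeat=2) if primal_feasible(x)]
dual = [w for w in product(grid, repeat=2) if dual_feasible(w)]

weak_duality_holds = all(dot(x, y0) >= dot(z0, w) for x in primal for w in dual)
best_primal = min(dot(x, y0) for x in primal)  # plays the role of M
best_dual = max(dot(z0, w) for w in dual)      # plays the role of -M'
print(weak_duality_holds, best_primal, best_dual)
```

In this example the smallest primal cost and the largest dual objective coincide, so $M = -M'$, and both slack terms of part (e) vanish at the pair of grid points where the extremes are attained.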


THEOREM 2. The programme $(A, P, Q, y_0, z_0)$ is consistent and has a finite
value $M$ if and only if the dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is
subconsistent and has the finite number $-M$ as its subvalue.


Proof. Let $T$ be the mapping $X \times R \to Z$ defined by $T(x, r) = Ax - rz_0$
and let $X \times R$ and $Y \times R$ be paired under the bilinear functional $((\,,))$
defined by $((\,(x, r), (y, s)\,)) = ((x, y))_1 + rs$. By appealing to Lemma 6 and
the definition of the value of a programme, we easily obtain that the
programme $(A, P, Q, y_0, z_0)$ is consistent and has a finite value $M$ if and only if
$(x, r) \in P \times R_0$ and $T(x, r) \in Q$ implies $((\,(x, r), (y_0, -M)\,)) \ge 0$, and for
every $\epsilon > 0$ there exists an $x \in P$ with $T(x, 1) \in Q$ and
$((\,(x, 1), (y_0, -(M + \epsilon))\,)) < 0$. By Lemma 2 these latter conditions are satisfied if and only if
$(y_0, -M)$ is an element in the dual of the cone $(P \times R_0) \cap T^{-1}(Q)$, and
$(y_0, -(M + \epsilon))$ is not. By appealing to Lemma 4, we see that this is possible
if and only if there exists a feasible net $\{w_\alpha'\}$ and a net $\{r_\alpha'\} \subset R_0$ with the
property that the net $\{((z_0, w_\alpha'))_2 - r_\alpha'\}$ converges to $M$ and for every
feasible net $\{w_\alpha\}$ and net $\{r_\alpha\} \subset R_0$, the net $\{((z_0, w_\alpha))_2 - r_\alpha\}$ does not
converge to $M + \epsilon$. This is clearly equivalent to the statement that the dual
programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is subconsistent and has a finite subvalue
$m'$ which is equal to $-M$.


COROLLARY 2.1. The dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is consistent
and has a finite value $M'$ if and only if the programme $(A, P, Q, y_0, z_0)$ is
subconsistent and has the finite number $-M'$ as its subvalue.


Proof. Theorems 2 and 1 (i).


COROLLARY 2.2. If $K$ is a finite real number then the programme $(A, P, Q, y_0, z_0)$
has $K$ as its value and subvalue if and only if the dual programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ has $-K$ as its value and subvalue.


Proof. Theorem 2 and Corollary 2.1.


THEOREM 3. Let the programme $(A, P, Q, y_0, z_0)$ be consistent and have a
finite value $M$. The dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is convergent and
has $-M$ as its value, if the set $G = \{(A^{*}w + y,\ r - ((z_0, w))_2);\ y \in P^{+},\ w \in Q^{+},$
and $r \in R_0\}$ is $w(Y \times R, X \times R)$-closed. In particular, $G$ is
$w(Y \times R, X \times R)$-closed if $Q$ has a non-empty $s(Z, W)$-interior $Q^{0}$ and there exists
an $x \in P$ such that $Ax - z_0 \in Q^{0}$.


Proof. Define $T$ as in the proof of Theorem 2. Then $G = T^{*}(Q^{+}) + (P^{+} \times R_0)$.
The closure assumption on $G$ and Lemma 4 assure that $(y_0, -M) \in G$
and for every $\epsilon > 0$, $(y_0, -(M + \epsilon))$ is not an element of $G$. Hence there
exist $y' \in P^{+}$, $w' \in Q^{+}$ and $r' \in R_0$ such that $A^{*}w' + y' = y_0$ and
$((z_0, w'))_2 - r' = M$. Moreover, $y \in P^{+}$, $w \in Q^{+}$ and $A^{*}w + y = y_0$ implies that
$((z_0, w))_2 < M + \epsilon$ whenever $\epsilon > 0$. It is clear that $r' = 0$ and that $w'$ is extremal. The
last statement in the theorem follows from Lemma 5.


COROLLARY 3.1. Let the dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ be consistent
and have a finite value $M'$. The programme $(A, P, Q, y_0, z_0)$ is convergent and
has $-M'$ as its value if the set $H = \{(Ax - z,\ r + ((x, y_0))_1);\ x \in P,\ z \in Q,$
and $r \in R_0\}$ is $w(Z \times R, W \times R)$-closed. In particular $H$ is $w(Z \times R, W \times R)$-closed
if $P^{+}$ has a non-empty $s(Y, X)$-interior $(P^{+})^{0}$ and there exists a $w \in Q^{+}$
such that $y_0 - A^{*}w \in (P^{+})^{0}$.


Proof. Theorems 3 and 1 (i).


THEOREM 4. Let the programme $(A, P, Q, y_0, z_0)$ be consistent and have a
finite value $M$. Suppose the set $U = \{(Ax - z,\ ((x, y_0))_1);\ x \in P$ and $z \in Q\}$
has a non-empty $s(Z \times R, W \times R)$-interior $U^{0}$ and there exists an $r_0 \in R$
such that $(z_0, r_0) \in U^{0}$. Then the dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is
consistent and has $-M$ as its value.


Proof. Let $E = \{r;\ (z_0, r) \in U\}$ and $F = \{r;\ (z_0, r) \in \bar U\}$, where $\bar U$ is the
$w(Z \times R, W \times R)$-closure of $U$. Clearly, $M$ is the greatest lower bound of $E$,
and either $F$ has no lower bound, or $m$ is the least element of $F$. Assume that
$F$ has no lower bound. Then $(z_0, r) \in \bar U$ for all $r \le r_0$. By definition of $M$
we have $M \le r_0$ so that $2M - r_0 - 2 < r_0$. Therefore $(z_0, 2M - r_0 - 2) \in \bar U$.
By (3, p. 51, Proposition 15), $\tfrac12 (z_0, r_0) + \tfrac12 (z_0, 2M - r_0 - 2) \in U^{0}$. Thus
$(z_0, M - 1) \in U^{0}$. This contradiction of the minimality of $M$ shows that $F$
has a lower bound $m$. Clearly $m \le M$. Applying the same proposition once
more, we see that $t(z_0, r_0) + (1 - t)(z_0, m) \in U^{0}$ for all $0 < t \le 1$.
Consequently $m = M$. Corollary 2.2 insures that the dual programme is consistent
and has a finite value $-M$.


COROLLARY 4.1. Let the dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ be consistent
and have a finite value $M'$. Suppose the set $V = \{(A^{*}w + y,\ ((z_0, w))_2);\ w \in Q^{+}$
and $y \in P^{+}\}$ has a non-empty $s(Y \times R, X \times R)$-interior $V^{0}$ and there exists
an $r_0 \in R$ such that $(y_0, r_0) \in V^{0}$. Then the programme $(A, P, Q, y_0, z_0)$ is
consistent and has $-M'$ as its value.


Proof. Theorems 4 and 1 (i).


The next few theorems concerning the consistency of programmes follow 
easily from Lemma 6 and the preceding theorems. 


THEOREM 5. The programme $(A, P, Q, y_0, z_0)$ is consistent if and only if the
programme $(A^{*}, Q^{+}, -P^{+}, -z_0, 0)$ has zero as its subvalue.


COROLLARY 5.1. The dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is consistent
if and only if the programme $(A, P, Q, y_0, 0)$ has zero as its subvalue.


THEOREM 6. The programme $(A, P, Q, y_0, z_0)$ is subconsistent if and only if
the programme $(A^{*}, Q^{+}, -P^{+}, -z_0, 0)$ has zero as its value. If $A(P) - Q$ is
$w(Z, W)$-closed, the programme $(A, P, Q, y_0, z_0)$ is consistent.


COROLLARY 6.1. The dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ is subconsistent
if and only if the programme $(A, P, Q, y_0, 0)$ has zero as its value. If
$A^{*}(Q^{+}) + P^{+}$ is $w(Y, X)$-closed, the dual programme is consistent.


THEOREM 7. Let $Q$ have a non-null $s(Z, W)$-interior $Q^{0}$. Then there exists an
$x \in P$ with $Ax - z_0 \in Q^{0}$ if and only if the programme $(A^{*}, Q^{+}, -P^{+}, -z_0, 0)$
has zero as its value and the zero vector is the only extremal element.


Proof. Necessity: If there exists an $x \in P$ with $Ax - z_0 \in Q^{0}$ then $w \in Q^{+}$
and $w \ne 0$ implies that $((Ax - z_0, w))_2 > 0$. That is, $((x, -A^{*}w))_1 < -((z_0, w))_2$.
Consequently, $w \in Q^{+}$, $w \ne 0$, and $-A^{*}w \in P^{+}$ implies $-((z_0, w))_2 > 0$.
Sufficiency: If $Q^{0}$ is non-empty then the $s(Z, W)$-interior of $A(P) - Q$ is
non-empty. Thus if $w \in Q^{+}$, $w \ne 0$, and $-A^{*}w \in P^{+}$ implies $((-z_0, w))_2 > 0$, then
$z_0$ is in the $s(Z, W)$-interior of $A(P) - Q$. Because $Q^{0}$ is non-empty, Lemma
1 insures that $z_0 \in A(P) - Q^{0}$. In other words, there exists an $x \in P$ with
$Ax - z_0 \in Q^{0}$.


COROLLARY 7.1. Let $P^{+}$ have a non-null $s(Y, X)$-interior $(P^{+})^{0}$. Then there
exists a $w \in Q^{+}$ with $y_0 - A^{*}w \in (P^{+})^{0}$ if and only if the programme
$(A, P, Q, y_0, 0)$ has zero as its value and the zero vector is the only extremal element.


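THEOREM 8. Let $\{z_\alpha\}$ be a net in $Z$ which $w(Z, W)$-converges to $z_0$ and satisfies
the condition that $\alpha_2 \ge \alpha_1$ implies

$$z_{\alpha_1} - z_{\alpha_2} \in Q.$$
If the programme $(A, P, Q, y_0, z_{\alpha'})$ is consistent then the programme $(A, P, Q, y_0, z_\alpha)$
is consistent for all $\alpha \ge \alpha'$. If the programme $(A, P, Q, y_0, z_{\alpha'})$ is consistent,
has a finite value $M_{\alpha'}$, and the programme $(A, P, Q, y_0, z_0)$ has a finite value $M_0$
then for $\alpha \ge \alpha'$ the programme $(A, P, Q, y_0, z_\alpha)$ has a finite value $M_\alpha$ which
satisfies $M_{\alpha'} \ge M_\alpha \ge M_0$. If a bar denotes the $w(X, Y)$- or $w(Z, W)$-closure and
$$\overline{A^{-1}\bigl(\textstyle\bigcup_\alpha \{z_\alpha + Q\}\bigr)} \supset A^{-1}\bigl(\overline{\textstyle\bigcup_\alpha \{z_\alpha + Q\}}\bigr)$$
then $\lim M_\alpha = M_0$.

Proof. If $K_\alpha = \{x;\ x \in P$ and $Ax - z_\alpha \in Q\}$ then

$$K_0 \supset K_{\alpha_2} \supset K_{\alpha_1}$$

whenever $\alpha_2 \ge \alpha_1$. Thus if $K_{\alpha'}$ is non-empty then $K_\alpha$ is non-empty for $\alpha \ge \alpha'$.
If the programme $(A, P, Q, y_0, z_0)$ has a finite value $M_0$ then $M_0$ is a lower
bound for $M_\alpha$. To prove that $\lim M_\alpha = M_0$ when the condition of the last
statement is satisfied it is sufficient to show that $\overline{\bigcup_\alpha K_\alpha} = K_0$. By definition
of $\bigcup_\alpha K_\alpha$ it is seen that

$$\overline{\textstyle\bigcup_\alpha K_\alpha} = P \cap \overline{A^{-1}\bigl(\textstyle\bigcup_\alpha \{z_\alpha + Q\}\bigr)}$$
and that
$$\overline{\textstyle\bigcup_\alpha \{z_\alpha + Q\}} = z_0 + Q.$$
Thus
$$K_0 = P \cap A^{-1}\bigl(\overline{\textstyle\bigcup_\alpha \{z_\alpha + Q\}}\bigr).$$
Since $A$ is $w(X, Y) - w(Z, W)$ continuous,
$$\overline{A^{-1}\bigl(\textstyle\bigcup_\alpha \{z_\alpha + Q\}\bigr)} \subset A^{-1}\bigl(\overline{\textstyle\bigcup_\alpha \{z_\alpha + Q\}}\bigr).$$
By assumption the reverse inclusion is satisfied. Consequently $\overline{\bigcup_\alpha K_\alpha} = K_0$.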
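
COROLLARY 8.1. Let $\{y_\alpha\}$ be a net in $Y$ which $w(Y, X)$-converges to $y_0$ and
satisfies the condition that $\alpha_2 \ge \alpha_1$ implies
$$y_{\alpha_2} - y_{\alpha_1} \in P^{+}.$$
If the dual programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_{\alpha'})$ is consistent then the programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_\alpha)$ is consistent for $\alpha \ge \alpha'$. If the programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_{\alpha'})$ is consistent, has a finite value $M'_{\alpha'}$, and the programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ has a finite value $M'_0$ then for $\alpha \ge \alpha'$ the programme
$(A^{*}, Q^{+}, -P^{+}, -z_0, y_\alpha)$ has a finite value $M'_\alpha$ which satisfies
$M'_{\alpha'} \ge M'_\alpha \ge M'_0$. If a bar denotes the $w(W, Z)$- or $w(Y, X)$-closure and

$$\overline{(A^{*})^{-1}\bigl(\textstyle\bigcup_\alpha \{y_\alpha - P^{+}\}\bigr)} \supset (A^{*})^{-1}\bigl(\overline{\textstyle\bigcup_\alpha \{y_\alpha - P^{+}\}}\bigr)$$
then $\lim M'_\alpha = M'_0$.


Proof. Theorems 8 and 1 (i).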


4. Programmes in locally convex separated spaces. In this section a
locally convex separated linear topological space will be abbreviated to
lcs-space. We shall depart slightly from the notation of the previous sections
and denote the dual of an lcs-space $(E, T)$ by $E^{*}$. If $(E, T)$ is an lcs-space it
is clear that $E$ and $E^{*}$ are paired under the bilinear functional $((\,,))$ defined
as $((e, e^{*})) = e^{*}(e)$ and that $E^{*}$ and $E$ are paired under the bilinear functional
$((\,,))_1$ defined as $((e^{*}, e))_1 = e^{*}(e)$. Consequently the theorems established
in the preceding section apply in the case of lcs-spaces.

It is also helpful to remember that if $(U, T_1)$ and $(V, T_2)$ are lcs-spaces
then one can form a programme $(A, P, Q, y_0, z_0)$ for the paired spaces $X$ and
$Y$ and the paired spaces $Z$ and $W$ in four ways:


(1) $X = U$, $Y = U^{*}$, $Z = V$, $W = V^{*}$;
(2) $X = U$, $Y = U^{*}$, $Z = V^{*}$, $W = V$;
(3) $X = U^{*}$, $Y = U$, $Z = V$, $W = V^{*}$; and
(4) $X = U^{*}$, $Y = U$, $Z = V^{*}$, $W = V$.


It is assumed here and in the remainder of this section that an lcs-space and
its dual are paired in the manner given in the previous paragraph.

If $(U, T_1)$ and $(V, T_2)$ are disk spaces then $s(U, U^{*}) = T_1$ and $s(V, V^{*}) = T_2$.
Consequently, Theorem 3 and its Corollary insure the following. Case
(1): if $Q$ has a non-empty $T_2$-interior $Q^{0}$ and there exists a $u \in P$ with $Au - z_0 \in Q^{0}$
then the dual programme is convergent and has value $-M$ if the
programme has $M$ as its value. Case (3): if $Q$ has a non-empty $T_2$-interior $Q^{0}$
and there exists a $u^{*} \in P$ with $Au^{*} - z_0 \in Q^{0}$ then the dual programme
is convergent and has value $-M$ if the programme has $M$ as its value. If $P^{+}$
has a non-empty $T_1$-interior $(P^{+})^{0}$ and there exists a $v^{*} \in Q^{+}$ with
$y_0 - A^{*}v^{*} \in (P^{+})^{0}$ then the programme is convergent and has value $-M'$ if the dual
programme has $M'$ as its value. Case (4): if $P^{+}$ has a non-empty $T_1$-interior
$(P^{+})^{0}$ and there exists a $v \in Q^{+}$ with $y_0 - A^{*}v \in (P^{+})^{0}$ then the programme
is convergent and has value $-M'$ if the dual programme has $M'$ as its value.


5. Examples. 5.1. The necessity of subvalues. If $U$ and $V$ are finite dimensional
Euclidean spaces and $P$ and $Q$ are the positive orthants (or, more generally,
polyhedral cones) then one easily obtains from Theorem 3 the duality theorem
of linear programming. Namely, a programme is consistent and has the
finite number $M$ as its value if and only if the dual programme is consistent
and has the finite number $-M$ as its value. Moreover, extremal vectors exist
for both programmes. If $P$ and $Q$ are not polyhedral the result concerning the
values is still true if both programmes are consistent. However, in this case it
is quite easy to see that there needn't exist extremal vectors for consistent
programmes which have finite values, nor need there exist extremal vectors
for consistent dual programmes which have finite values. The following
example demonstrates that in a Hilbert space setting both the programme and
the dual programme may be consistent, and have finite values $M$ and $M'$,
respectively, which satisfy $M + M' > 0$.

Let $L_2$ denote the set of Lebesgue measurable functions on the unit interval
which are square integrable and consider the problem of determining
$M = \inf \int_0^1 x\,g(x)\,dx + 2r$ subject to (1) $g$ an element of $L_2$ which is non-negative almost
everywhere; (2) $r \in R_0$; and (3) $\int_t^1 g(x)\,dx + r \ge 1$ almost everywhere
$(0 \le t \le 1)$. It is easy to see that this problem is equivalent to determining
the value of the programme $(A, P, Q, y_0, z_0)$ for the spaces $L_2 \times R$ and $L_2 \times R$
paired under $((\,,))_1$ defined by

$$((\,(g_1, r_1), (g_2, r_2)\,))_1 = \int_0^1 g_1(x) g_2(x)\,dx + r_1 r_2$$

and the spaces $L_2$ and $L_2$ paired under $((\,,))_2$ defined by

$$((g_1, g_2))_2 = \int_0^1 g_1(x) g_2(x)\,dx.$$

$A$ is the linear transformation from $L_2 \times R \to L_2$ defined by
$$A(g, r)(t) = \int_t^1 g(x)\,dx + r \quad (0 \le t \le 1),$$
$P = P_1 \times R_0$ where $P_1$ consists of those elements of $L_2$ which are non-negative
almost everywhere, $Q = P_1$, $y_0 = (h, 2)$ where $h$ is the element of $L_2$ defined by
$h(t) = t$ $(0 \le t \le 1)$, and $z_0$ is the element of $L_2$ defined by $z_0(t) = 1$ $(0 \le t \le 1)$.

The dual programme is the programme $(A^{*}, Q^{+}, -P^{+}, -z_0, y_0)$ for the
spaces $L_2$ and $L_2$ and the spaces $L_2 \times R$ and $L_2 \times R$. $A^{*}$ is the linear
transformation from $L_2 \to L_2 \times R$ defined by

$$A^{*}g(t) = \Bigl(\int_0^t g(x)\,dx,\ \int_0^1 g(x)\,dx\Bigr) \quad (0 \le t \le 1),$$
$Q^{+} = P_1$, and $-P^{+} = (-P_1) \times (-R_0)$.

The problem of determining the value of the dual programme can be written:
determine $M' = -\sup \int_0^1 f(x)\,dx$ subject to (1) $f$ an element of $L_2$ which is
non-negative almost everywhere, (2) $\int_0^t f(x)\,dx \le t$ almost everywhere
$(0 \le t \le 1)$, and (3) $\int_0^1 f(x)\,dx \le 2$.


























It is not difficult to show that (1) $M = 2$ and $M' = -1$; (2) the pair
$(\bar g, 1)$ where $\bar g(t) = 0$ $(0 \le t \le 1)$ is extremal for the programme; (3) the
net $\{(\bar g_n, \bar r_n)\}$ where $\bar g_n$ is defined by $\bar g_n(t) = n t^{n-1}$ $(0 \le t \le 1)$ and $\bar r_n = 0$ is
extremal for the programme and has the net $\{\bar h_n\}$ where $\bar h_n$ is defined by
$\bar h_n(t) = 0$ $(0 \le t \le 1)$ as an associated net; (4) $\bar f$ defined by $\bar f(t) = 1$ $(0 \le t \le 1)$ is
extremal for the dual programme; and (5) the net $\{\bar f_n\}$ where $\bar f_n$ is defined by
$\bar f_n(t) = 1 + n t^{n-1}$ $(0 \le t \le 1)$ is extremal for the dual programme and has
the net $\{\bar k_n\}$ where $\bar k_n(t) = 0$ $(0 \le t \le 1)$ as an associated net.
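
The gap between value and subvalue in this example can be made concrete with exact arithmetic. The sketch below is an illustration and not part of the paper. It takes the net elements $g_n(t) = n t^{n-1}$ with $r_n = 0$, evaluates the cost $\int_0^1 x\,g_n(x)\,dx + 2r_n$, and measures how far $A(g_n, 0)$ is from $z_0$ in the $L_2$ norm. The formula $A(g_n, 0)(t) = 1 - t^{n}$ is worked out by hand and hard-coded, and polynomials are stored as hypothetical `{power: coefficient}` dictionaries.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply polynomials stored as {power: coefficient} dictionaries."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            out[a + b] = out.get(a + b, 0) + ca * cb
    return out


def integral_01(p):
    """Exact integral over [0, 1] of sum(c * t**k)."""
    return sum(Fraction(c, k + 1) for k, c in p.items())


def net_element(n):
    """g_n(t) = n * t**(n - 1)."""
    return {n - 1: n}


def cost(n):
    """Objective of the programme at (g_n, r_n) with r_n = 0."""
    return integral_01(poly_mul({1: 1}, net_element(n)))


def residual_norm_sq(n):
    """Squared L2 norm of A(g_n, 0) - z_0, using A(g_n, 0)(t) = 1 - t**n."""
    residual = {n: -1}
    return integral_01(poly_mul(residual, residual))


for n in (1, 10, 100):
    print(n, cost(n), residual_norm_sq(n))
```

The costs $n/(n+1)$ stay below 1 and approach it, while the residual norms $1/\sqrt{2n+1}$ tend to zero, so the net is feasible in the limit with the zero associated net. No genuinely feasible pair $(g, r)$ can cost less than 2, since constraint (3) forces $r \ge 1$.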


5.2. Unattained infima and unbounded extremal nets. In 5.1 it was stated 
that extremal vectors always exist for consistent linear programming problems 
which have finite values. This example shows that a programme may be con- 
sistent and have a finite value which is not attained. Moreover, all extremal 
nets are unbounded in a sense to be defined. 

Let $C$ denote the continuous real-valued functions on the unit interval and
let $BV$ be the set of all real-valued functions $g$ on the unit interval which have
bounded variation and satisfy the normalizing conditions: $g(0) = 0$ and
$g(t + 0) = g(t)$ $(0 < t < 1)$. Then the spaces $BV$ and $C$ are paired under the
bilinear functional $((\,,))$ defined by $((g, f)) = \int_0^1 f(t)\,dg(t)$. Let $P$ be those
elements of $BV$ which are non-decreasing and define the linear transformation
from $BV \to R$ by $A(g) = \int_0^1 t\,dg(t)$. Let $y_0$ be the element of $C$ defined by
$y_0(t) = t^{2}$ $(0 \le t \le 1)$, $z_0$ be the unit element of $R$ (that is, the real number
one), and denote the cone consisting of the real number zero by $\{0\}$.

The programme $(A, P, \{0\}, y_0, z_0)$ for the paired spaces $BV$ and $C$ and the
space $R$ paired with itself is consistent and has zero as its value and subvalue.
There exist no extremal elements, the sequence $\{\bar g_n\}$ where $\bar g_n$ is defined by
$\bar g_n(t) = 0$ $(0 \le t < 1/n)$ and $\bar g_n(t) = n$ $(1/n \le t \le 1)$ is an extremal sequence
and every extremal sequence is unbounded in the sense that the total variation
of the elements of the sequence increases without limit. Specifically, the value
of the programme is the $\inf \int_0^1 t^{2}\,dg(t)$ subject to $g$ a non-decreasing normalized
function of bounded variation which satisfies the condition: $\int_0^1 t\,dg(t) = 1$.

If g is a feasible vector it is clear that v. tdg(t) > 0. Consequently, the 
value of the programme is non-negative. Moreover, by Schwarz’s inequality: 


1 1 el sl sl 
f + 1)dg(t) < f t*dg(t) | f t'dg(t) + 2 j t dg(t) +f ai |. 
0 0 0 et 0 
Thus 
el J sl 
J t*dg(t) I f t'dg(t) +1 + | agi) | >1 
0 0 0 


and consequently fi t?dg(t) > 0. The elements of the sequence {9,} defined 
above are non-decreasing, normalized functions of bounded variation which 
satisfy the conditions: fie dg(t) = 1 and fi t?dg,(t) = 1/n. Thus the value 
of the programme is zero and it is not attained. Suppose {g,} is any feasible 














232 K. S. KRETSCHMER 


sequence. Then $\int_0^1 t\,dg_n(t)$ converges to 1 and $\int_0^1 t^2\,dg_n(t) \ge 0$. Hence the
subvalue of the programme is also zero. If $\{g_n\}$ is extremal then $\int_0^1 t\,dg_n(t)$
converges to 1 and $\int_0^1 t^2\,dg_n(t)$ converges to 0. By Schwarz's inequality we deduce
that

$$\int_0^1 t^2\,dg_n(t)\left[\int_0^1 t^2\,dg_n(t) + 2\int_0^1 t\,dg_n(t) + \int_0^1 dg_n(t)\right] \ge \left[\int_0^1 t\,dg_n(t)\right]^2.$$

As n increases without limit the right side approaches 1 so the limit inferior of
the left side must be greater than or equal to 1. Since $\int_0^1 t^2\,dg_n(t)$ converges to
zero and $\int_0^1 t\,dg_n(t)$ converges to 1 it is clear that $\int_0^1 dg_n(t)$ must increase without
limit. Since $g_n$ is non-decreasing, $\int_0^1 dg_n(t)$ is the total variation of $g_n$ on the
unit interval. This establishes that the total variation of the elements of every
extremal sequence increases without limit.
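
The behaviour of the step functions $g_n$ can be checked with exact arithmetic. The following is a minimal sketch, not part of the original argument: it assumes that $g_n$ has a single jump of height n at t = 1/n, so that a Stieltjes integral against $g_n$ reduces to a point evaluation. The function name `step_moments` is hypothetical.

```python
from fractions import Fraction

def step_moments(n):
    """Return (integral of t dg_n, integral of t^2 dg_n, total variation of g_n).

    Assumption: g_n has one jump of height n at t = 1/n, so the Stieltjes
    integral of a continuous f against g_n equals f(1/n) * n.
    """
    t, jump = Fraction(1, n), Fraction(n)
    return t * jump, t * t * jump, jump

for n in (1, 2, 10, 1000):
    constraint, objective, variation = step_moments(n)
    # Schwarz bound from the text, evaluated exactly.
    lhs = (objective + constraint) ** 2
    rhs = objective * (objective + 2 * constraint + variation)
    assert lhs <= rhs
    print(n, constraint, objective, variation)
```

The constraint stays at 1 while the objective 1/n tends to 0 and the total variation n grows without bound, which is the pattern the text establishes for every extremal sequence.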

It is relatively easy to modify this example to obtain a programme which 
is not consistent but is subconsistent and has a finite subvalue. This implies 
that there exists a programme (namely, the dual programme) which is con- 
sistent and has a finite value, but whose subvalue does not exist. A problem 
similar to this example arises in statistics (7, 16, 19) and may be handled in 
the same way. 

5.3. Discontinuity of programmes. Theorem 8 pertains to a kind of con-
tinuity property of programmes. This example shows that it is possible to
have a sequence of programmes $(A, P, Q, y_0, z_n)$ $(n = 1, 2, \ldots)$ each of which
is convergent and has unity as its value. Moreover, $\{z_n\}$ converges to 0 in a
decreasing manner (that is, $z_n - z_{n+1} \in Q$) and the limit programme
$(A, P, Q, y_0, 0)$ has zero as its value.

Let C and BV be defined as in the previous example and let P be those ele-
ments of C which are non-negative. For $n \ge 2$, let $z_n$ be the element of C
defined by $z_n(t) = t$ $(0 \le t \le 1/n)$ and $z_n(t) = (1 - t)/(n - 1)$ $(1/n \le t \le 1)$.
Let A be the linear transformation from R into C defined by $(Ar)(t) = rt$
$(0 \le t \le 1)$. It is clear that for $n \ge 2$ the programme $(A, P, R, 1, z_n)$ for the paired
spaces C and BV and the space R paired with itself is consistent and has
unity as its value. The sequence $\{z_n\}$ converges monotonically to 0 and the
programme $(A, P, R, 1, 0)$ for the same spaces is consistent and has zero as
its value.
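
A numerical sketch of this discontinuity follows. It assumes the programme $(A, P, R, 1, z)$ amounts to finding the smallest real r with $rt \ge z(t)$ on the unit interval, and it approximates the interval by a finite grid. The helper names `z` and `programme_value` are hypothetical.

```python
def z(n, t):
    """z_n from the text: t on [0, 1/n], then (1 - t)/(n - 1) on [1/n, 1]."""
    return t if t <= 1.0 / n else (1.0 - t) / (n - 1)

def programme_value(z_func, grid=10_000):
    """Smallest r with r*t >= z_func(t) at every grid point t in (0, 1].

    The point t = 0 imposes no restriction because z_func(0) = 0 here.
    """
    ts = [k / grid for k in range(1, grid + 1)]
    return max(z_func(t) / t for t in ts)

values = [programme_value(lambda t, n=n: z(n, t)) for n in (2, 5, 50)]
limit_value = programme_value(lambda t: 0.0)
print(values, limit_value)
```

On the grid each $z_n$ forces r up to 1 near the origin, while the limit function 0 allows r = 0, so the value drops in the limit.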

5.4. Haar's extension of the Minkowski-Farkas lemma. The Minkowski-
Farkas lemma may be stated as follows: "Let A be an $n \times m$ matrix and let
A* be its transpose. Let $P_n$ denote the positive orthant of Euclidean n-space
$E_n$ and let $u^* = (u^*_1, \ldots, u^*_m)$ be an element of Euclidean m-space. If
$Au \in P_n$ implies

$$\sum_{i=1}^{m} u_i u^*_i \ge 0$$

then there exists a $v \in P_n$ such that $A^*v = u^*$."
This lemma follows easily from Lemma 11 when one recognizes that the
image of a polyhedral cone under a continuous linear transformation is itself a 
polyhedral cone and is therefore closed. 
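
The lemma itself is an existence statement, but its elementary converse is easy to test. The sketch below uses a hypothetical 3 by 2 matrix and a hypothetical v. It checks that when $u^* = A^*v$ with v in the positive orthant, every u with $Au \in P_n$ satisfies $\sum u_i u^*_i \ge 0$, because that sum equals the inner product of Au and v.

```python
from itertools import product

def mat_vec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

A = [[1, 0], [1, 1], [0, 2]]             # hypothetical matrix, n = 3, m = 2
v = [2, 1, 3]                            # a point of the positive orthant P_3
A_star = [list(col) for col in zip(*A)]  # transpose of A
u_star = mat_vec(A_star, v)              # u* = A* v

checked = 0
for u in product(range(-3, 4), repeat=2):
    Au = mat_vec(A, u)
    if all(c >= 0 for c in Au):
        # <u, A* v> = <A u, v>, and both factors on the right are non-negative.
        assert dot(u, u_star) == dot(Au, v) >= 0
        checked += 1
print(u_star, checked)
```

This only exercises the easy direction on a small integer grid. The content of the lemma is the reverse implication, which rests on the closedness of the polyhedral cone.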

It does not seem to be well known that the Minkowski-Farkas lemma was
extended about thirty-five years ago by A. Haar (14) to cover a more general
case. In particular the following theorem was established: "Let $g_0, g_1, \ldots, g_n$ be
a finite collection of linearly independent functions of bounded variation on the
unit interval. Define the linear transformation A mapping C, the continuous real-
valued functions on the unit interval, into $E_n$ by

$$Af = \left(\int_0^1 f(t)\,dg_1(t), \ldots, \int_0^1 f(t)\,dg_n(t)\right).$$

If $Af \in P_n$ implies $\int_0^1 f(t)\,dg_0(t) \ge 0$ then there exists a vector
$(u_1, \ldots, u_n) \in P_n$ such that $\sum_{i=1}^{n} u_i g_i = g_0$ in the sense that
$\int_0^1 f(t)\,d\left(\sum_{i=1}^{n} u_i g_i\right)(t) = \int_0^1 f(t)\,dg_0(t)$ for all $f \in C$." The proof given above for the Minkowski-
Farkas lemma applies in this case if we assume that the functions $g_0, g_1, \ldots, g_n$
are normalized as in 5.2.

It is now evident that the following theorem is true and contains both the
Minkowski-Farkas lemma and Haar's extension as special cases. Let (U, T)
be an lcs-space and let $u^*_0$ be an element of the dual space (U, T)*. Let A
be a continuous linear transformation from U into $E_n$ and let A* denote the
dual map. Let $P_n$ denote the positive orthant of $E_n$. If $Au \in P_n$ implies
$u^*_0(u) \ge 0$ then there exists a $v \in P_n$ such that $A^*v = u^*_0$.

Duffin has suggested a similar theorem which is proved using a slightly
different argument. Let P be a closed convex cone in an lcs-space (U, T) and
let $u^*_1, \ldots, u^*_n$ be n elements in the dual space (U, T)*. If for every $u \in P$
there exists at least one $u^*_i$ $(i = 1, \ldots, n)$ such that $u^*_i(u) \ge 0$ then there exist
n non-negative real numbers $a_1, \ldots, a_n$, not all zero, such that $\sum_{i=1}^{n} a_i u^*_i \in P^+$.

The proof is accomplished by introducing a continuous linear transformation
T which maps $U \times R$ into $U \times E_n$. Suppose $A(u, r) = (r - u^*_1(u), \ldots,
r - u^*_n(u))$. The assumptions imply that if the transformation T is defined
as $T(u, r) = (u, A(u, r))$ then $T(u, r) \in P \times P_n$ implies $(( (u, r), (0, 1) )) \ge 0$.
The bilinear functional $((\,,))$ is defined on the Cartesian product $(U \times R) \times
(U^* \times R)$ as $(( (u, r), (u^*, s) )) = u^*(u) + rs$. Lemma 11 insures that (0, 1)
belongs to the weak-star closure of $T^*(P^+ \times P_n)$. Therefore, there exist se-
quences $\{u^*_{(k)}\} \subset P^+$ and $\{a^{(k)} = (a^{(k)}_1, \ldots, a^{(k)}_n)\} \subset P_n$ such that the sequence
$\{u^*_{(k)} - \sum_{i=1}^{n} a^{(k)}_i u^*_i\}$ converges to the zero element of (U, T)* in the weak-
star topology and $\{\sum_{i=1}^{n} a^{(k)}_i\}$ converges to unity. The non-negativity of the
$a^{(k)}_i$ and the convergence of $\{\sum_{i=1}^{n} a^{(k)}_i\}$ imply that (passing to a subsequence
if necessary) the sequence $\{a^{(k)}_i\}$ converges for each i, to say $\bar a_i$. Thus
$\{\sum_{i=1}^{n} a^{(k)}_i u^*_i\}$ converges in the weak-star
topology to $\sum_{i=1}^{n} \bar a_i u^*_i$ and consequently so does $\{u^*_{(k)}\}$. Therefore
$\sum_{i=1}^{n} \bar a_i u^*_i \in P^+$. Clearly $\bar a_i \ge 0$ and $\sum_{i=1}^{n} \bar a_i = 1$.

5.5. A non-linear problem. Consider the problem of determining the
maximum value of $ar + bt$ subject to r and t real numbers which satisfy
$(r^2 + t^2)^{1/2} \le 1$. It is easy to see that this maximum exists and is equal to
$(a^2 + b^2)^{1/2}$. The extremal point is $\bar r = a/(a^2 + b^2)^{1/2}$, $\bar t = b/(a^2 + b^2)^{1/2}$. It is
not obvious that this problem can be expressed as the problem of determining
the value of a programme. A proof of this fact follows.














Let $E_2$ be Euclidean 2-space. Then a point $(r, t) \in E_2$ satisfies the con-
dition $(r^2 + t^2)^{1/2} \le 1$ if and only if $|ru + tv| \le 1$ for all $(u, v) \in E_2$ which
satisfy the condition $(u^2 + v^2)^{1/2} \le 1$. Let S denote the unit circle, C(S) denote
the set of all real-valued functions which are continuous on S, Q be the non-
negative elements of C(S), and denote $C(S) \times C(S)$ by $C_2(S)$. Let M(S) denote
the set of all regular countably additive real-valued set functions defined on
the Borel sets in S and denote $M(S) \times M(S)$ by $M_2(S)$. Then $C_2(S)$ and $M_2(S)$
are paired under the bilinear functional $((\,,))$ defined by $(( (f, g), (h, k) )) =
\int_S f(s)h(ds) + \int_S g(s)k(ds)$ (11, p. 265, Theorem 3). Let A be the linear trans-
formation from $E_2$ into $C_2(S)$ defined by $[A(r, t)](u, v) = (ru + tv, -ru - tv)$.
Let e be the element of C(S) defined as $e(u, v) = 1$ $((u, v) \in S)$.

The negative of the value of the programme $(-A, E_2, -Q, (-a, -b),
(-e, -e))$ for the space $E_2$ paired with itself and the paired spaces $C_2(S)$
and $M_2(S)$ is equal to the maximum value of $ar + bt$ subject to r and t real
numbers which satisfy $(r^2 + t^2)^{1/2} \le 1$. It is interesting to note that the problem
of determining the value of the dual programme is equivalent to determining
the minimum variation of those elements g of M(S) which satisfy the conditions
$\int_S u\,g(d(u, v)) = a$ and $\int_S v\,g(d(u, v)) = b$. The solution to this problem is the
measure $\bar g$ on S defined as $\bar g(F) = (a^2 + b^2)^{1/2}$ if F is a Borel subset of S con-
taining the point $(a/(a^2 + b^2)^{1/2}, b/(a^2 + b^2)^{1/2})$ and $\bar g(F) = 0$ if the Borel subset
F of S does not contain the point $(a/(a^2 + b^2)^{1/2}, b/(a^2 + b^2)^{1/2})$. The variation
of $\bar g$ is $(a^2 + b^2)^{1/2}$.
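
The agreement between the maximum and the variation of the point mass can be illustrated numerically. This is a rough sketch under stated assumptions: the maximum is approximated by sampling the unit circle, the dual solution is represented as a (point, mass) pair, and the values a = 3, b = 4 are arbitrary. The function names are hypothetical.

```python
import math

def primal_max(a, b, steps=3600):
    """Approximate max of a*r + b*t over the unit circle by sampling angles."""
    return max(
        a * math.cos(2 * math.pi * k / steps) + b * math.sin(2 * math.pi * k / steps)
        for k in range(steps)
    )

def dual_point_mass(a, b):
    """Point mass from the text: mass (a^2+b^2)^(1/2) at (a, b)/(a^2+b^2)^(1/2)."""
    norm = math.hypot(a, b)
    return (a / norm, b / norm), norm

a, b = 3.0, 4.0
(u, v), mass = dual_point_mass(a, b)
# Moment conditions of the dual problem: integral of u dg = a, integral of v dg = b.
assert abs(u * mass - a) < 1e-12 and abs(v * mass - b) < 1e-12
print(primal_max(a, b), mass)
```

The sampled maximum and the variation of the point mass agree up to the sampling error.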

It is clear that the same method can be applied to convert similar types of 
non-linear programming problems in Euclidean and other Banach spaces into 
programmes in paired spaces. 

5.6. Continuous games. It is well known that one can use the duality theory 
of linear programming to prove that every finite game has a value (12, p. 326). 
Continuous games on the unit square have been considered from the same 
point of view (24, § 10) although a complete proof of Ville’s theorem based on 
this approach has not appeared in the literature. It will now be shown that 
this can be done using the duality theory of § 3. This application of program- 
ming theory was suggested to the author by Duffin. 


Ville's theorem. Let K be a continuous function on the unit square. Let G
denote the set of all real-valued functions g on the unit interval which are non-
decreasing and satisfy g(0) = 0 and g(1) = 1. Let

$$V(g, h) = \int_0^1\!\!\int_0^1 K(r, t)\,dg(r)\,dh(t),$$

$$V_1 = \min_{h \in G} \max_{g \in G} V(g, h) \quad\text{and}\quad V_2 = \max_{g \in G} \min_{h \in G} V(g, h).$$

Then $V_1 = V_2$.


Proof. We may assume without loss of generality that K is strictly positive.
It is clear that $V_1$ and $V_2$ exist and satisfy the relation $V_2 \le V_1$. It is not diffi-
cult to show that $V_1 \le V_2$ if there exist $g_0, h_0 \in G$ and a real number v such
that $\int_0^1 K(r, t)\,dg_0(r) \ge v$ $(0 \le t \le 1)$, and $\int_0^1 K(r, t)\,dh_0(t) \le v$ $(0 \le r \le 1)$. It
will now be shown that $g_0$, $h_0$, and v can be constructed from the extremal vec-
tors of a certain programme.

Let C, BV, and P be defined as in 5.2 and let Q be the non-negative elements
of C. Define the linear transformation A from BV into C by $(Ag)(t) = \int_0^1
K(r, t)\,dg(r)$ $(0 \le t \le 1)$. Let $y_0$ be the element of C defined by $y_0(t) = 1$
$(0 \le t \le 1)$. The value of the programme $(A, P, Q, y_0, y_0)$ for the paired
spaces BV and C and the paired spaces C and BV is equal to $\inf \int_0^1 1\,dg(t)$
subject to the conditions that $g \in P$ and $\int_0^1 K(r, t)\,dg(r) \ge 1$ $(0 \le t \le 1)$. The
topology s(C, BV) is the topology induced by the norm on C defined as
$\|f\| = \max_{0 \le t \le 1} |f(t)|$. This follows from the fact that C together with this
norm is a Banach space. It is clear that Q has a non-null s(C, BV)-interior.
In fact any element of C which is strictly positive is in the s(C, BV)-interior
of Q.

Since K is strictly positive there exists a $g \in P$ such that $\int_0^1 K(r, t)\,dg(r) >
1$ $(0 \le t \le 1)$, that is, $Ag - y_0$ belongs to the s(C, BV)-interior of Q. Theorem
3 insures that the programme $(A, P, Q, y_0, y_0)$ has a finite value M and that
the dual programme has $-M$ as its value. Moreover, both programmes have
extremal elements, say $g_1$ and $h_1$. Since $g_1 \in P$ and $\int_0^1 K(r, t)\,dg_1(r) \ge 1$ it is
clear that M > 0. The functions $g_0 = (1/M)g_1$, $h_0 = (1/M)h_1$ and the real
number $v = 1/M$ satisfy the conditions in the first paragraph of the proof.

It is interesting to note that Theorem 3 would not be applicable if the
theorem were stated using the w(C, BV) topology. For in this topology the
set Q does not have an interior (11, p. 265, Corollary 4).
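
The scaling step of the proof can be illustrated on a finite analogue. The sketch below is an assumption-laden toy, not the continuous construction: it replaces the unit square by a hypothetical 2 by 2 payoff matrix, writes the dual in unsigned maximization form (so its optimum is M rather than -M), and takes the candidate vectors `g1` and `h1` as given instead of solving for them.

```python
from fractions import Fraction as Fr

K = [[Fr(3), Fr(1)],
     [Fr(1), Fr(2)]]          # hypothetical strictly positive payoff K[r][t]

g1 = [Fr(1, 5), Fr(2, 5)]     # feasible for: min sum(g), sum_r K[r][t] g[r] >= 1, g >= 0
h1 = [Fr(1, 5), Fr(2, 5)]     # feasible for: max sum(h), sum_t K[r][t] h[t] <= 1, h >= 0

def against_columns(K, g):
    return [sum(K[r][t] * g[r] for r in range(len(g))) for t in range(len(K[0]))]

def against_rows(K, h):
    return [sum(K[r][t] * h[t] for t in range(len(h))) for r in range(len(K))]

assert all(x >= 1 for x in against_columns(K, g1))
assert all(x <= 1 for x in against_rows(K, h1))
M = sum(g1)
assert M == sum(h1)           # equal objectives, so both vectors are extremal

# Scale as in the proof: g0 = g1 / M, h0 = h1 / M, v = 1 / M.
v = 1 / M
g0 = [x / M for x in g1]
h0 = [x / M for x in h1]
assert sum(g0) == 1 == sum(h0)
assert all(x >= v for x in against_columns(K, g0))
assert all(x <= v for x in against_rows(K, h0))
print(v, g0, h0)
```

Here M = 3/5, so the scaled vectors are probability vectors guaranteeing the value v = 5/3 to each player, mirroring the roles of $g_0$, $h_0$, and v above.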

5.7. A generalization of the transportation problem. The transportation problem
may be expressed as determining the minimum value of

$$\sum_{i=1}^{m_1} \sum_{j=1}^{m_2} c_{ij} x_{ij}$$

subject to the conditions that the $x_{ij}$ are non-negative,

$$\sum_{j=1}^{m_2} x_{ij} = a_i \ (i = 1, \ldots, m_1), \quad\text{and}\quad \sum_{i=1}^{m_1} x_{ij} = b_j \ (j = 1, \ldots, m_2).$$

It is assumed that $c_{ij}$, $a_i$, and $b_j$ are real numbers and that $a_i$ and $b_j$ satisfy

$$\sum_{i=1}^{m_1} a_i = \sum_{j=1}^{m_2} b_j, \quad a_i \ge 0, \quad\text{and}\quad b_j \ge 0.$$

A continuous analogue of this problem has been treated in the literature by
Kantorovitch (18). The example presented below is similar but more general.

Let $(F_1, T_1)$ and $(F_2, T_2)$ be compact Hausdorff spaces and let $(F_3, T_3)$ be
their topological product. Let $B_i$ denote the smallest $\sigma$-field of subsets of $F_i$
which contains the closed subsets of $(F_i, T_i)$, $H_i$ denote the set of all regular
countably additive (finite) real-valued set functions on $B_i$, $C_i$ denote the set
of all continuous real-valued functions on $(F_i, T_i)$, $Q_i$ denote the non-negative
elements of $C_i$, $P_i$ denote the elements of $H_i$ which are non-negative on $B_i$,
and $O_i$ denote the cone consisting of the zero element of $H_i$. For $g \in H_3$ define
the set functions $g_1$ on $B_1$ and $g_2$ on $B_2$ by $g_1(G_1) = g(G_1 \times F_2)$ and $g_2(G_2) =
g(F_1 \times G_2)$. Let A be the linear transformation from $H_3$ into $H_1 \times H_2$ de-
fined by $Ag = (g_1, g_2)$. Let $p_1$ be an element of $P_1$, $p_2$ be an element of $P_2$
which satisfies $p_2(F_2) = p_1(F_1)$, and $c \in C_3$. Then $H_i$ and $C_i$ are paired
spaces (11, p. 265, Theorem 3) and $P_i^+ = Q_i$.

The programme $(A, P_3, O_1 \times O_2, c, (p_1, p_2))$ for the paired spaces $H_3$ and
$C_3$ and the paired spaces $H_1 \times H_2$ and $C_1 \times C_2$ and its dual are convergent
and the values satisfy $M + M' = 0$. Specifically,

$$M = \inf \int_{F_3} c(r, t)\,g(d(r, t))$$

subject to $g \in P_3$, $g_1 = p_1$, and $g_2 = p_2$; and

$$M' = -\sup\left[\int_{F_1} f_1\,dp_1 + \int_{F_2} f_2\,dp_2\right]$$

subject to $f_i \in C_i$ and $f_1(r) + f_2(t) \le c(r, t)$ $((r, t) \in F_1 \times F_2)$. The formulation
of the dual programme uses the fact that the dual map A* of A is defined by
$A^*(f_1, f_2)(r, t) = f_1(r) + f_2(t)$. The statements about the values and the
convergent nature of the programme and its dual follow from Corollary 3.1.

That this result implies the familiar theorem about the transportation
problem is seen by letting $F_1 = \{1, \ldots, m_1\}$, $F_2 = \{1, \ldots, m_2\}$, $T_1$ be the
set of all subsets of $F_1$ and the null set, and $T_2$ be the set of all subsets of $F_2$
and the null set.
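
For the finite case, the equality of the two values can be seen on a tiny instance. The following sketch uses hypothetical data (a 2 by 2 cost matrix with integer supplies and demands), enumerates only integer plans, and takes the dual pair `f1`, `f2` as given. Equality of the primal cost and the dual objective then certifies that the plan is optimal among all non-negative plans, not just the integer ones.

```python
from itertools import product

c = [[1, 3], [4, 2]]   # hypothetical costs c[i][j]
a = [2, 1]             # row totals a_i
b = [1, 2]             # column totals b_j

def integer_plans(a, b):
    """Yield every non-negative integer matrix with row sums a and column sums b."""
    m1, m2 = len(a), len(b)
    for flat in product(range(max(a + b) + 1), repeat=m1 * m2):
        x = [list(flat[i * m2:(i + 1) * m2]) for i in range(m1)]
        if [sum(r) for r in x] == a and [sum(col) for col in zip(*x)] == b:
            yield x

def cost(x):
    return sum(c[i][j] * x[i][j] for i in range(len(a)) for j in range(len(b)))

best = min(integer_plans(a, b), key=cost)

# Candidate dual variables with f1[i] + f2[j] <= c[i][j] (assumed, not derived here).
f1, f2 = [0, -1], [1, 3]
assert all(f1[i] + f2[j] <= c[i][j] for i in range(2) for j in range(2))
dual_value = sum(f1[i] * a[i] for i in range(2)) + sum(f2[j] * b[j] for j in range(2))
assert dual_value == cost(best)
# The dual constraint is tight wherever the plan ships a positive amount.
assert all(f1[i] + f2[j] == c[i][j]
           for i in range(2) for j in range(2) if best[i][j] > 0)
print(best, cost(best), dual_value)
```

The last assertion is the finite counterpart of the extremality condition discussed next: the dual inequality holds with equality on the support of an extremal plan.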

Kantorovitch did not consider the dual programme as such, although he 
did recognize that the variables of the dual programme were of importance 
in proving whether or not a feasible g was extremal. In particular he proved a 
special case of the following theorem. 

If $(F_1, T_1) = (F_2, T_2)$, $c \in Q_3$, and $c(r, r) = 0$ for all $r \in F_2$, then a g
which is feasible for the programme is extremal if and only if there exists an
$h \in C_2$ which satisfies $|h(r) - h(t)| \le c(r, t)$ for all $r, t \in F_2$ and $h(r) - h(t)
= c(r, t)$ whenever $g(N_r \times N_t) > 0$ for all neighbourhoods $N_r$ of r and $N_t$ of t.
Kantorovitch presented an analytical proof which is valid when $(F_2, T_2)$ is a
compact metric space. The proof presented below is based on duality theory
and is valid under the conditions specified.


Proof. Let $g_0$ be an extremal element for the programme. It is not difficult
to show that

$$M = \min \int_{F_3} c(s)\,(g + k)(ds) \quad (s = (r, t))$$

subject to $g, k \in P_3$, $g_1 - k_1 = p_1$ and $g_2 - k_2 = p_2$. Moreover, $(g_0, 0)$ attains
the minimum. If the dual of the programme corresponding to this problem is
constructed it is seen that M is also equal to

$$\max\left[\int_{F_2} f_1\,dp_1 + \int_{F_2} f_2\,dp_2\right]$$
subject to $f_1, f_2 \in C_2$ and $|f_1(r) + f_2(t)| \le c(r, t)$ for all $r, t \in F_2$. Since $c(r, r) =
0$ for all $r \in F_2$, it is clear that if the maximum is attained for $f^*_1$ and $f^*_2$,
then $f^*_1(r) + f^*_2(r) = 0$ or $f^*_2(r) = -f^*_1(r)$ for all $r \in F_2$. Theorem 1(e) shows
that

$$\int_{F_3} \{c(r, t) - [f^*_1(r) - f^*_1(t)]\}\,g_0(ds) = 0 \quad (s = (r, t)).$$

Thus $f^*_1(r) - f^*_1(t) = c(r, t)$ except on sets of $g_0$-measure zero. This shows
that the condition is necessary. The proof that the condition is sufficient is
straightforward.

For other problems which can be expressed as programmes in paired spaces
the reader may refer to (1, chapters 4, 5, 6, 7, 11, and 12; 2, chapters 6 and 7;
8; 10; 15; 21, pp. 105-126; 22, chapter 2; 24).


REFERENCES 


1. K. J. Arrow, S. Karlin, and H. Scarf, Studies in the mathematical theory of inventory and 
production (Stanford, 1958). 

2. R. Bellman, Dynamic programming (Princeton, 1957). 

3. N. Bourbaki, Espaces vectoriels topologiques (Eléments de Mathématique, XV, Première
Partie [Paris, 1953]), V, chapters I and II.

4. ——— Espaces vectoriels topologiques (Eléments de Mathématique, XVIII, Première
Partie [Paris, 1955]), V, chapters III, IV, and V.

5. D. Bratton, The duality theorem in linear programming, Cowles Commission Discussion
Papers, Mathematics No. 427 (1955).

6. A. Charnes, W. W. Cooper, and A. Henderson, Introduction to linear programming (New
York, 1953).

7. H. Chernoff and S. Reiter, Selection of a distribution function to minimize an expectation 
subject to side conditions, Technical Report No. 23, Applied Mathematics and Statistics 
Laboratory, Stanford University (1954). 

8. G. B. Dantzig, The programming of interdependent activities: mathematical model, in (20) 
19-32. 

9. G. B. Dantzig, Maximization of a linear function of variables subject to linear inequalities, 
in (20) 339-347. 

10. R. J. Duffin, Infinite programs, in (23) 157-170. 

11. N. Dunford and J. T. Schwartz, Linear operators, Part I: General theory (New York, 
1958). 

12. D. Gale, H. W. Kuhn, and A. W. Tucker, Linear programming and theory of games, in 
(20) 317-329. 

13. Saul I. Gass, Linear programming: methods and applications (New York, 1958).

14. A. Haar, Ueber lineare Ungleichungen, Acta Math. (Szeged), 2 (1924-26), 1-14.

15. L. Hurwicz, Programming in linear spaces, in K. J. Arrow, L. Hurwicz, and H. Uzawa,
Studies in linear and non-linear programming (Stanford, 1958), 38-102.

16. S. Isaacson and H. Rubin, On minimizing an expectation subject to certain side conditions, 
Technical Report No. 25, Applied Mathematics and Statistics Laboratory, Stanford 
University (1954). 

17. University of Kansas, Linear topological spaces (1953).

18. L. Kantorovitch, On the translocation of masses, Comptes Rendus (Doklady) de l'Académie
des Sciences de l'URSS, XXXVII (1942), 7-8. Reproduced in Management Science,
5 (1958), 1-4.

19. S. H. Khamis, On the reduced moment problem, Ann. Math. Stat., 22 (1951), 532-536.

20. T. C. Koopmans (ed.), Activity analysis of production and allocation (New York, 1951).

21. T. C. Koopmans, Three essays on the state of economic science (New York, 1957).

22. K. S. Kretschmer, Linear programming in locally convex spaces and its use in analysis
(Dissertation, Carnegie Institute of Technology, June, 1958).

23. H. W. Kuhn and A. W. Tucker (eds.), Linear inequalities and related systems (Princeton,
1956).

24. R. Sherman Lehman, On the continuous simplex method, The Rand Corporation, Research
Memorandum RM-1386 (1954).
















WIDTHS AND HEIGHTS OF (0,1)-MATRICES 
D. R. FULKERSON AND H. J. RYSER


Introduction. A number of combinatorial problems may be regarded as
particular instances of the following rather general situation. Given a set X
composed of n elements $x_1, x_2, \ldots, x_n$, and m subsets $X_1, X_2, \ldots, X_m$ of X,
find a minimal system of representatives for $X_1, X_2, \ldots, X_m$. That is, single
out a subset X* of X such that $X_i \cap X^*$ is non-empty for $i = 1, 2, \ldots, m$,
and no subset of X containing fewer elements than X* has this property. To
illustrate, each of the following can be thought of in these terms.

(a) Find the smallest number of nodes that touch all arcs in a linear graph.
Thus the sets $X_1, X_2, \ldots, X_m$ are the arcs of the graph, each set consisting
of two elements, its end nodes. A famous example of this is the eight queens 
chessboard problem. Here one forms a graph by connecting two cells of the 
board if a queen can move from one cell to the other. Then the complement 
of a minimal system of cells that touch all arcs represents positions in which 
the maximal number of queens can be placed so that no two attack each 
other. 

(b) Given two distinct nodes in a graph, find a set of arcs, minimal in
number, that cut all chains leading from one node to the other. Here the
elements $x_1, x_2, \ldots, x_n$ are the arcs of the graph, and the sets $X_1, X_2, \ldots, X_m$
are all chains that join the two given nodes. A similar problem is to find the
smallest number of arcs that cut all directed cycles in a directed graph.

(c) Given the truth table for a proposition letter formula F in r proposition
letters $p_1, p_2, \ldots, p_r$, find a disjunctive normal form of F that has the fewest
terms. That this problem, which arises, for example, in the design
of switching circuits, falls in the category of minimal set representative
problems, can be seen as follows: As elements of the fundamental set X, admit
all terms having one of the forms $q_i$, $q_i q_j$, $\ldots$, $q_1 q_2 \cdots q_r$, where $q_i$
is either $p_i$ or its negation $\bar p_i$, and such that the term takes the value t (true)
only if $F(p_1, p_2, \ldots, p_r)$ does also, for all values of the proposition letters
$p_1, p_2, \ldots, p_r$. In other words, a t in the truth table for an admissible term
implies a t in the same position for the F truth table. The subsets to be repre-
sented are formed by grouping together, for each assignment of values to
$p_1, p_2, \ldots, p_r$ that makes $F(p_1, p_2, \ldots, p_r)$ true, all of the admissible terms
that are also true for this assignment of values. For example, suppose
that $F(p_1, p_2, p_3)$ is given by the truth table below (Table 1).


Received February 15, 1960. Most of the results in this paper were obtained while the 
authors were participating in the IBM-sponsored Summer Institute on Combinatorial Prob-
lems. The work of the second author was also supported in part by the Office of Ordnance
Research. 




TABLE 1

p_1   p_2   p_3   F
 f     f     f    f
 f     f     t    f
 f     t     f    t
 f     t     t    f
 t     f     f    t
 t     f     t    t
 t     t     f    t
 t     t     t    f


Then the elements of X are

$$p_1\bar p_2,\ p_1\bar p_3,\ p_2\bar p_3,\ \bar p_1 p_2\bar p_3,\ p_1\bar p_2\bar p_3,\ p_1\bar p_2 p_3,\ p_1 p_2\bar p_3,$$

and the four subsets to be represented are

$$X_1 = \{p_2\bar p_3,\ \bar p_1 p_2\bar p_3\},$$
$$X_2 = \{p_1\bar p_2,\ p_1\bar p_3,\ p_1\bar p_2\bar p_3\},$$
$$X_3 = \{p_1\bar p_2,\ p_1\bar p_2 p_3\},$$
$$X_4 = \{p_1\bar p_3,\ p_2\bar p_3,\ p_1 p_2\bar p_3\}.$$

A minimal system of representatives is given by selecting the terms $p_1\bar p_2$,
$p_2\bar p_3$, that is,

$$F(p_1, p_2, p_3) = p_1\bar p_2 + p_2\bar p_3$$

and F cannot be represented by a disjunctive normal form having fewer terms.
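
The construction in (c) can be reproduced by brute force for this truth table. The sketch below is illustrative only: a term is encoded as a tuple of (variable index, required value) pairs with 0-based indices, and the function names are hypothetical. The search is exhaustive, which is feasible only for very small r.

```python
from itertools import combinations, product

# Rows of Table 1 on which F is true, written as (p1, p2, p3) with 1 = t, 0 = f.
TRUE_ROWS = [(0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0)]

def satisfied(term, row):
    """A term is a tuple of (variable index, required value) pairs."""
    return all(row[i] == val for i, val in term)

def admissible_terms(r=3):
    """All non-empty terms that are true only on rows where F is true."""
    terms = []
    for choice in product((None, 0, 1), repeat=r):
        term = tuple((i, val) for i, val in enumerate(choice) if val is not None)
        if not term:
            continue
        rows = [row for row in product((0, 1), repeat=r) if satisfied(term, row)]
        if all(row in TRUE_ROWS for row in rows):
            terms.append(term)
    return terms

def minimal_representatives(elements, subsets):
    """Smallest selection of elements meeting every subset (exhaustive search)."""
    for size in range(1, len(elements) + 1):
        for chosen in combinations(elements, size):
            if all(any(e in s for e in chosen) for s in subsets):
                return list(chosen)
    return None

terms = admissible_terms()
subsets = [[t for t in terms if satisfied(t, row)] for row in TRUE_ROWS]
chosen = minimal_representatives(terms, subsets)
print(len(terms), [len(s) for s in subsets], chosen)
```

With this encoding, `((0, 1), (1, 0))` stands for $p_1\bar p_2$ and `((1, 1), (2, 0))` for $p_2\bar p_3$.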

Many other combinatorial problems can be viewed as minimal representative 
problems. (But doing so is unlikely to make the problem any easier.) Of the 
three listed above, only one, so far as we know, might properly be termed 
solved. This is the first problem mentioned under (b), for which the max flow 
min cut theorem provides a theoretical answer on the one hand, and on the 
other hand, an algorithm based on network flow considerations can be used 
to construct, in a highly efficient manner, a minimal cut set of arcs for any 
particular graph (2). For undirected graphs, the second problem under (b) 
is easy, the answer being the cyclomatic number of the graph, but for directed 
graphs, very little seems to be known. The problem in this latter form has 
been proposed by Moore (cf. 14). Berge (1) has obtained some results on 
problem (a), and Roth (9) has studied problems of type (c) using combina- 
torial topological methods. 

From the computational standpoint, any minimal set representative 
problem can be put in the form of an integer linear programme, for which 
Gomory (6) has devised promising algorithms. Thus, for example, we may 
take the constraint matrix $A = [a_{ij}]$ for the programme to be the incidence
matrix of sets vs. elements, that is, $a_{ij} = 1$ if $x_j$ is in $X_i$, $a_{ij} = 0$ otherwise.
Then the minimal set representative problem asks for non-negative integers
$w_1, w_2, \ldots, w_n$ that minimize the linear form $\sum_{j=1}^{n} w_j$ over all selections of
non-negative integers satisfying the constraints

$$\sum_{j=1}^{n} a_{ij} w_j \ge 1, \quad i = 1, 2, \ldots, m.$$
j=l 


In general, however, the incidence matrix A is much too large to make such 
a computation feasible. Sometimes one can obtain other linear programmes 
that are not so formidable in size, and in certain cases, the programme may 
even be formulated so that optimal solutions are always integral. This is the 
situation, for example, in the first problem listed under (b), for which an 
appropriate formulation (not in terms of the incidence matrix A of chains vs. 
arcs) can be described that is both reasonable in size and automatically yields 
integer answers. 
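
As a toy instance of the integer programme in incidence-matrix form, the sketch below takes problem (a) for a cycle on four nodes. It is an exhaustive search, not one of the algorithms cited above. It restricts each $w_j$ to 0 or 1, which loses nothing because any larger value can be lowered to 1 without violating a constraint. The matrix and the function name are hypothetical.

```python
from itertools import product

def min_representatives(A):
    """Minimize sum(w) over 0-1 vectors w with sum_j A[i][j] * w[j] >= 1 for all i."""
    n = len(A[0])
    best = None
    for w in product((0, 1), repeat=n):
        if all(sum(a * x for a, x in zip(row, w)) >= 1 for row in A):
            if best is None or sum(w) < sum(best):
                best = w
    return best

# Rows are the arcs of a 4-cycle, columns its nodes x_1, ..., x_4.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 1]]
best = min_representatives(A)
print(best, sum(best))
```

The search visits all $2^n$ vectors, which illustrates why the incidence-matrix formulation becomes impractical as n grows.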

The results of this paper are not aimed at a solution of the minimal set
representative problem per se, but may be viewed as providing some informa-
tion on this problem. Specifically, we are interested in obtaining bounds on
the minimal number of representatives by allowing the incidence matrix to
vary over all matrices of zeros and ones having the same row and column
sums as the given A, that is, the class $\mathfrak{A}$ generated by A (10). From this
standpoint, the present paper may be regarded as a continuation of (4; 7;
11; 12), in which other combinatorially significant quantities associated with
an incidence matrix A have been so studied.

In order to avoid repeating the cumbersome phrase "the number of
elements in a minimal set of representatives for A," we call this simply the
"width" of A, or more precisely, the "1-width" of A, since we generalize the
problem to $\alpha$-widths, that is, we insist that each subset be represented at
least $\alpha$ times. Throughout we let $\epsilon(\alpha)$ denote the $\alpha$-width of a specified A;
$\tilde\epsilon(\alpha)$ and $\bar\epsilon(\alpha)$ then denote, respectively, the minimum and maximum $\alpha$-widths
taken over all A in $\mathfrak{A}$. The problem of determining $\tilde\epsilon(\alpha)$ in terms of the given
row and column sums that characterize $\mathfrak{A}$ is completely solved in the sequel,
but our efforts to pin down $\bar\epsilon(\alpha)$ have so far been unsuccessful.* In solving
the $\tilde\epsilon(\alpha)$ problem, an auxiliary notion, the "$\alpha$-height" of A, turns out to be
important. This, and the other notions introduced informally above, will be
defined more precisely in § 1.

Throughout the paper we use purely combinatorial methods in establishing
results. It should be mentioned, however, that the formula obtained for $\tilde\epsilon(\alpha)$
can also be derived using network flows, and was in fact first obtained in this
way. From the viewpoint of flow theory, the function $N(\epsilon, e, f)$ introduced
in § 4 can be interpreted as representing possible minimal cut capacities in
an appropriate flow network.


*Since the results of this paper were obtained, it has been shown by one of the authors
that a solution to the $\bar\epsilon(1)$ problem would settle the existence question for finite projective
planes. See (13). Thus the maximal width problem appears to be considerably deeper and
more important than the minimal width problem.















1. Concepts and notation. Let A be a matrix of m rows and n columns
and let each entry of A be 0 or 1. We call A a (0, 1)-matrix of size m by n.
Let the sum of row i of A be denoted by $r_i$ and let the sum of column j of A
be denoted by $s_j$. We call $R = (r_1, r_2, \ldots, r_m)$ the row sum vector and
$S = (s_1, s_2, \ldots, s_n)$ the column sum vector of A. The vectors R and S deter-
mine a class

(1.1) $\mathfrak{A} = \mathfrak{A}(R, S)$

consisting of all (0, 1)-matrices A of size m by n, with row sum vector R
and column sum vector S. Simple necessary and sufficient conditions on R
and S are available in order that the class $\mathfrak{A}$ be non-empty (5; 10). Let A
be in $\mathfrak{A}$ and consider the 2 by 2 submatrices of A of the types

$$A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$

An interchange is a transformation of the elements of A that changes a minor
of type $A_1$ into $A_2$, or vice versa, and leaves all other elements of A unaltered.
The interchange theorem (10) asserts that if A and A' belong to $\mathfrak{A}$, then A
is transformable into A' by interchanges. In our study we may suppose without
loss of generality that $\mathfrak{A}$ is non-empty and that

(1.2) $r_1 \ge r_2 \ge \cdots \ge r_m > 0,$
(1.3) $s_1 \ge s_2 \ge \cdots \ge s_n > 0.$

Such a class is called normalized. Henceforth we take $\mathfrak{A}$ normalized.
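
An interchange is simple to state in code. The sketch below assumes a matrix stored as a list of rows with 0-based indices. It flips the four entries where rows i, k meet columns j, l, provided they form a minor of one of the two types, and leaves everything else untouched. The function name is hypothetical.

```python
def interchange(A, i, k, j, l):
    """Return a copy of A with the minor on rows i, k and columns j, l swapped
    between the types [[1, 0], [0, 1]] and [[0, 1], [1, 0]]."""
    minor = (A[i][j], A[i][l], A[k][j], A[k][l])
    if minor not in ((1, 0, 0, 1), (0, 1, 1, 0)):
        raise ValueError("the chosen rows and columns do not form such a minor")
    B = [row[:] for row in A]
    for r, c in ((i, j), (i, l), (k, j), (k, l)):
        B[r][c] = 1 - B[r][c]
    return B

A = [[1, 0, 1],
     [0, 1, 1]]
B = interchange(A, 0, 1, 0, 1)
# Row and column sums are unchanged, so B lies in the same class as A.
assert [sum(r) for r in A] == [sum(r) for r in B]
assert [sum(c) for c in zip(*A)] == [sum(c) for c in zip(*B)]
print(B)
```

Because each affected row and column loses one 1 and gains one 1, the result stays in the class determined by R and S.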
Let $\alpha$ be an integer in the interval

(1.4) $0 \le \alpha \le r_m,$

and let $\epsilon$ be an integer in the interval

(1.5) $1 \le \epsilon \le n.$

Let A be a matrix in the normalized class $\mathfrak{A}(R, S)$ and suppose A has an
m by $\epsilon$ submatrix E* each of whose row sums is at least $\alpha$. An integer $\alpha$ ful-
filling these requirements is said to be compatible with $\epsilon$ in A.

Suppose now that $\alpha$ is positive and compatible with $\epsilon$ in A. If this is the
case, then we say that the $\epsilon$ columns of our m by $\epsilon$ submatrix E* of A form
an $\alpha$-set of representatives for the matrix A. Let $\epsilon(\alpha)$ be the minimal number
of columns of A that form an $\alpha$-set of representatives for A. Such a column
set is called a minimal $\alpha$-set of representatives for A and $\epsilon(\alpha)$ is called the
$\alpha$-width of A. The integer $\alpha$ and the matrix A uniquely determine $\epsilon(\alpha)$. We
note at the outset that the $\alpha$-width $\epsilon(\alpha)$ of A is invariant under arbitrary
permutations of the rows and columns of A. However, the $\alpha$-width of A',
the transpose of A, may differ drastically from that of A.

Let E* be a submatrix of A of size m by $\epsilon(\alpha)$ that yields a minimal $\alpha$-set
of representatives for A. Let E be the submatrix of E* composed of all of
the rows of E* that contain $\alpha$ 1's and $\epsilon(\alpha) - \alpha$ 0's. The matrix E is called
a critical $\alpha$-submatrix of A. Note that E cannot be empty since if all row
sums of E* exceed $\alpha$, then deletion of any column of E* yields an $\alpha$-set of
representatives for A, contradicting the minimality of $\epsilon(\alpha)$.


THEOREM 1.1. The matrix A has an $\alpha$-width $\epsilon(\alpha)$ for each $\alpha$ in the interval
$1 \le \alpha \le r_m$. A critical $\alpha$-submatrix E of A associated with an $\alpha$-width $\epsilon(\alpha)$
contains no zero columns.


Proof. Suppose that a critical $\alpha$-submatrix E of A associated with an
$\alpha$-width $\epsilon(\alpha)$ contains a zero column. Let E* be the m by $\epsilon(\alpha)$ submatrix of
A containing E. The column of E* containing the 0 column of E may be deleted
and this yields an m by $\epsilon(\alpha) - 1$ matrix with minimal row sum $\alpha$. But this
contradicts the minimality of $\epsilon(\alpha)$.

Each of the critical $\alpha$-submatrices E of A must contain $\epsilon(\alpha)$ columns. But
the number of rows in the various critical $\alpha$-submatrices need not be fixed.
Let E be a critical $\alpha$-submatrix containing the minimal number of rows $\delta(\alpha)$.
The positive integer $\delta(\alpha)$ is called the $\alpha$-height of A. Both $\epsilon(\alpha)$ and $\delta(\alpha)$ are
basic invariants of the matrix A. Evidently

(1.6) $\epsilon(1) < \epsilon(2) < \cdots < \epsilon(r_m)$

and by Theorem 1.1,

(1.7) $\delta(1) \ge \epsilon(1).$
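
For small matrices the $\alpha$-width and $\alpha$-height can be found by exhaustive search straight from the definitions. The following sketch is not the authors' method. It scans column subsets in order of size and, among the minimal $\alpha$-sets, counts rows whose sum is exactly $\alpha$. The function name and the sample matrix are hypothetical.

```python
from itertools import combinations

def width_and_height(A, alpha):
    """Return (alpha-width, alpha-height) of the (0, 1)-matrix A by brute force."""
    n = len(A[0])
    for size in range(1, n + 1):
        heights = []
        for cols in combinations(range(n), size):
            sums = [sum(row[j] for j in cols) for row in A]
            if min(sums) >= alpha:
                # Rows with sum exactly alpha form the critical submatrix.
                heights.append(sums.count(alpha))
        if heights:
            return size, min(heights)
    raise ValueError("alpha exceeds the smallest row sum")

A = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
print(width_and_height(A, 1), width_and_height(A, 2))
```

For this matrix the search agrees with (1.6) and (1.7): the width increases strictly with $\alpha$, and the 1-height is at least the 1-width.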


Thus far we have discussed for the most part a specified matrix in the
normalized class $\mathfrak{A}(R, S)$. We now turn our attention to properties of the
class $\mathfrak{A}(R, S)$. Let $\alpha$ and $\epsilon$ be fixed and let $\alpha$ be compatible with $\epsilon$. This means
that $\alpha$ and $\epsilon$ are restricted by (1.4) and (1.5). Moreover, there exists an A
in $\mathfrak{A}(R, S)$ with an m by $\epsilon$ submatrix E* whose minimal row sum is at least
$\alpha$. Now consider the class of all m by $\epsilon$ submatrices E'' of the matrices A in
$\mathfrak{A}(R, S)$ with the row sums of E'' greater than or equal to $\alpha$. Let $\delta''$ denote
the number of row sums in E'' equal to $\alpha$. The non-negative integer $\delta$ equal to
the minimum of the integers $\delta''$ is called the multiplicity of $\alpha$ with respect to
$\epsilon$. An $\alpha$ compatible with $\epsilon$ may be of multiplicity 0 with respect to $\epsilon$. This
will be the case whenever there exists an m by $\epsilon$ E'' with all of its row sums
greater than $\alpha$.

Let $1 \le \alpha \le r_m$. Then each A in $\mathfrak{A}(R, S)$ determines an $\alpha$-width $\epsilon(\alpha)$ and
$\alpha$ is compatible with $\epsilon(\alpha)$. For each $\alpha$ let the minimum of these $\epsilon(\alpha)$'s over
all A in $\mathfrak{A}(R, S)$ be denoted by

(1.8) $\tilde\epsilon = \tilde\epsilon(\alpha).$

Then $\alpha$ is compatible with $\tilde\epsilon(\alpha)$ and, by the minimality of $\tilde\epsilon(\alpha)$, if $\beta > \alpha$, then
$\beta$ is not compatible with $\tilde\epsilon(\alpha)$. We call $\tilde\epsilon = \tilde\epsilon(\alpha)$ the minimal $\alpha$-width of the
class $\mathfrak{A}(R, S)$. Let

(1.9) $\tilde\delta = \tilde\delta(\alpha)$














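denote the multiplicity of $\alpha$ with respect to $\tilde\epsilon(\alpha)$. The integer $\tilde\delta(\alpha)$ is positive
and is equal to the minimum of the $\delta(\alpha)$'s for all matrices $A_{\tilde\epsilon}$ in $\mathfrak{A}(R, S)$ of
$\alpha$-width $\tilde\epsilon(\alpha)$. It is clear that

(1.10) $\tilde\epsilon(1) < \tilde\epsilon(2) < \cdots < \tilde\epsilon(r_m)$

and

(1.11) $\tilde\delta(1) \ge \tilde\epsilon(1).$

Similarly for each $\alpha$ let the maximum of the $\epsilon(\alpha)$'s over all A in $\mathfrak{A}(R, S)$ be
denoted by

(1.12) $\bar\epsilon = \bar\epsilon(\alpha).$

We call $\bar\epsilon = \bar\epsilon(\alpha)$ the maximal $\alpha$-width of the class $\mathfrak{A}(R, S)$. A direct applica-
tion of the interchange theorem allows us to prove that if $\epsilon$ is an integer in
the interval

(1.13) $\tilde\epsilon \le \epsilon \le \bar\epsilon,$

then there exists a matrix $A_\epsilon$ in $\mathfrak{A}(R, S)$ of $\alpha$-width $\epsilon$ (see § 3).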
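
The minimal and maximal widths of a small class can be found by listing the class outright. The sketch below enumerates all (0, 1)-matrices with hypothetical row and column sum vectors R = S = (2, 2, 1, 1) and collects their 1-widths. The function names are hypothetical, and the enumeration is practical only for very small m and n.

```python
from itertools import combinations, product

def matrices_in_class(R, S):
    """Yield every (0, 1)-matrix with row sum vector R and column sum vector S."""
    n = len(S)
    row_choices = [
        [tuple(1 if j in cols else 0 for j in range(n))
         for cols in combinations(range(n), r)]
        for r in R
    ]
    for rows in product(*row_choices):
        if [sum(col) for col in zip(*rows)] == list(S):
            yield rows

def width(A, alpha=1):
    """alpha-width of A: fewest columns with every row sum at least alpha."""
    n = len(A[0])
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            if all(sum(row[j] for j in cols) >= alpha for row in A):
                return size
    raise ValueError("alpha exceeds the smallest row sum")

R = S = (2, 2, 1, 1)
widths = sorted({width(A) for A in matrices_in_class(R, S)})
print(widths)
```

In this example the set of widths is an unbroken run of integers from the minimal to the maximal 1-width, consistent with (1.13).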

In § 2 we take an $\alpha$ compatible with $\epsilon$ and of multiplicity $\delta$ with respect
to $\epsilon$. Under these conditions we establish the existence of a (0, 1)-matrix in
$\mathfrak{A}(R, S)$ with an unusually simple block decomposition. An application of this
theorem yields matrices of $\alpha$-width $\tilde\epsilon$ and $\alpha$-height $\tilde\delta$ in $\mathfrak{A}(R, S)$ called canonical
matrices. Their study in §§ 3 and 4 leads to simple and explicit formulas for
both $\tilde\epsilon$ and $\tilde\delta$. A straightforward construction for a canonical matrix is given
in § 5. Section 6 concludes with applications to the special classes of (0, 1)-
matrices containing k 1's in each row or k 1's in each column.


2. A block decomposition theorem. Let $0 \le \alpha \le r_m$ and let $1 \le \epsilon \le n$.
Let $\alpha$ be compatible with $\epsilon$ and of multiplicity $\delta$ with respect to $\epsilon$. We now
prove the block decomposition theorem that plays a fundamental role in our
subsequent investigations involving $\tilde\epsilon$ and $\tilde\delta$.


THEOREM 2.1. Let $\alpha$ be compatible with $\epsilon$ and of multiplicity $\delta$ with respect
to $\epsilon$. Then there exists a matrix A in the normalized class $\mathfrak{A}(R, S)$ of the form

(2.1) $$A = \begin{bmatrix} M & J & * \\ \begin{matrix} F \\ E \end{matrix} & * & 0 \end{bmatrix}.$$

Here E is of size $\delta$ by $\epsilon$ with exactly $\alpha$ 1's in each row. M is a matrix of size
e by $\epsilon$ with $\alpha + 1$ or more 1's in each row. F is a matrix of size $m - (e + \delta)$ by $\epsilon$
with exactly $\alpha + 1$ 1's in each row. J is a matrix of 1's of size e by $f - \epsilon$ and 0
is a zero matrix. The degenerate cases $e = 0$, $e + \delta = m$, $\delta = 0$, $f = \epsilon$, and $f = n$
are not excluded.










WIDTHS AND HEIGHTS OF (0,1)-MATRICES 245 


Proof. Let $A$ be a matrix in the normalized class $\mathfrak{A}(R, S)$ and let $A$ contain
a submatrix $E^*$ of size $m$ by $\epsilon$ with $\delta$ row sums equal to $\alpha$ and the remaining
$m - \delta$ row sums $> \alpha$. Let $\eta_1, \eta_2, \dots, \eta_\epsilon$ be the column vectors of $E^*$. The
matrix $A$ is selected so that the vectors $\eta_1, \eta_2, \dots, \eta_\epsilon$ are to the left as far
as possible among all matrices $A$ in $\mathfrak{A}$ containing an $m$ by $\epsilon$ submatrix $E^*$ of
the type described. Let $\eta$ be a column vector of $A$ and suppose that $\eta$ appears
to the left of $\eta_i$, where $\eta_i$ is one of $\eta_1, \eta_2, \dots, \eta_\epsilon$. Now the class $\mathfrak{A}$ is
normalized, so the column sums of $A$ are non-increasing. We apply interchanges
involving only the two columns $\eta$ and $\eta_i$, and replace $\eta$ by $\eta'$, and $\eta_i$ by $\eta_i'$.
The column $\eta'$ is to have 1's in all of the positions in which $\eta_i$ has 1's. These
interchanges yield a new matrix $A'$ in $\mathfrak{A}$. Now columns $\eta', \eta_1, \dots, \eta_{i-1},
\eta_{i+1}, \dots, \eta_\epsilon$ of $A'$ form an $m$ by $\epsilon$ submatrix of $A'$ with row sums $\ge \alpha$. Moreover,
the number of row sums in this submatrix equal to $\alpha$ is $\le \delta$. Hence the
matrix $A$ may be selected so that the $m$ by $\epsilon$ submatrix $E^*$ is confined to the
first $\epsilon$ columns.

If $\delta = 0$, then $A$ is of form (2.1) with $e = m$, $f = \epsilon$. Let $\delta$ be positive and
suppose that in the first $\epsilon$ columns of $A$ a row vector of $E^*$ of sum $\alpha$ occurs
above a row vector of $E^*$ of sum $> \alpha$. Since the row sums of $A$ are non-increasing,
we may apply interchanges to $A$ and lower the row vector of $E^*$
of sum $\alpha$. Hence we may obtain a matrix $A$ in the normalized $\mathfrak{A}$ with the
submatrix $E$ of (2.1) in the lower left corner.

We now take this matrix and by interchanges obtain a matrix of the
following form

(2.2) $$\left[\begin{array}{cccc} M_1 & \multicolumn{2}{c}{J} & C_0 \\ F_1 & W & \multicolumn{2}{c}{X} \\ E & Y & Z & 0 \end{array}\right]$$

Here $E$ is the matrix of (2.1). $F_1$ has exactly $\alpha + 1$ 1's in each row and $M_1$
has $\alpha + 2$ or more 1's in each row. $J$ is a matrix of 1's and $C_0$ has at least
one 0 in each column. The matrix 0 in the lower right corner must be a zero
matrix, since otherwise an interchange involving the blocks $M_1$, $C_0$, $E$, and 0
contradicts the minimality of $\delta$. (The tacit assumption that $M_1$ and $C_0$ both
appear is unimportant. For if this were not the case, (2.2) is already a degenerate
case of (2.1).)

Now let $Z$ be the zero matrix of $t$ columns that appears in all matrices of
the form (2.2) in the normalized class $\mathfrak{A}$. The integer $t$ is to be maximal, but
the case $t = 0$ is not excluded. Then there exists a matrix of the form (2.2)
with a 1 in the last column of $Y$. (Again if $Y$ does not appear, then (2.2) is
a degenerate case of (2.1).) Suppose that a 1 appears in row $j$ of $X$ and that
a 0 appears in row $j$ of $W$. We may apply an interchange if necessary and
assume that a 0 appears in row $j$ and the last column of $W$. Now an interchange
involving the 1 in row $j$ of $X$ and the 1 in the last column of $Y$ places
a 1 in 0 or in $Z$. This contradicts either the minimality of $\delta$ or the presence
of $Z$ in all matrices of the form (2.2) in $\mathfrak{A}$. Thus if $X$ contains a 1 in row $j$,
then row $j$ of $W$ is a row of 1's. This means that there exists a matrix $A$ in
the normalized $\mathfrak{A}$ of the form (2.1).


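3. The minimal $\alpha$-width $\tilde\epsilon(\alpha)$.


THEOREM 3.1. Let $\tilde\epsilon = \tilde\epsilon(\alpha)$ be the minimal $\alpha$-width of the normalized class
$\mathfrak{A}(R, S)$. Let $\tilde\delta = \tilde\delta(\alpha)$ be the multiplicity of $\alpha$ with respect to $\tilde\epsilon(\alpha)$. Then there
exists a matrix $A_{\tilde\epsilon}$ of $\alpha$-width $\tilde\epsilon$ in $\mathfrak{A}(R, S)$ of the form

(3.1) $$A_{\tilde\epsilon} = \begin{bmatrix} M & J & * \\ \begin{matrix} F \\ E \end{matrix} & * & 0 \end{bmatrix}.$$

Here $E$ is a critical submatrix of $A_{\tilde\epsilon}$ of size $\tilde\delta$ by $\tilde\epsilon$. $M$ is a matrix of size $e$ by $\tilde\epsilon$
with $\alpha + 1$ or more 1's in each row. $F$ is a matrix of size $m - (e + \tilde\delta)$ by $\tilde\epsilon$ with
exactly $\alpha + 1$ 1's in each row. $J$ is a matrix of size $e$ by $f - \tilde\epsilon$ consisting entirely
of 1's, and 0 is a zero matrix. Each of the first $\tilde\epsilon$ columns of $A_{\tilde\epsilon}$ contains more
than $m - \tilde\delta$ 1's. The degenerate cases $e = 0$, $e + \tilde\delta = m$, $f = \tilde\epsilon$, and $f = n$ are
not excluded.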


Proof. In Theorem 2.1 let $\epsilon = \tilde\epsilon(\alpha)$ and $\delta = \tilde\delta(\alpha)$. Then (2.1) establishes the
existence of a matrix $A_{\tilde\epsilon}$ of the form (3.1). Note that in Theorem 3.1 the
integers $\alpha$, $\tilde\epsilon$, and $\tilde\delta$ are positive and the degenerate case $\delta = 0$ of Theorem 2.1
is excluded. The matrix $A_{\tilde\epsilon}$ is of $\alpha$-width $\tilde\epsilon(\alpha)$. Each of the first $\tilde\epsilon$ columns of
$A_{\tilde\epsilon}$ contains more than $m - \tilde\delta$ 1's. For if this were not the case we could
apply interchanges confined to the first $\tilde\epsilon$ columns of $A_{\tilde\epsilon}$ and replace a column
of $E$ by 0's. But this contradicts the minimality of $\tilde\epsilon$.

The special case $\alpha = 1$ of Theorem 3.1 deserves mention. A (0, 1)-matrix
$M$ is maximal (10) provided that in each row of $M$ no 0 occurs to the left
of a 1. We prove that for $\alpha = 1$ the matrices $M$ and $F$ of (3.1) may be selected
as maximal matrices. Let $E^*$ be the $m$ by $\tilde\epsilon(1)$ matrix composed of the first
$\tilde\epsilon(1)$ columns of (3.1). Let the sum of column 1 of $E$ be $e_1$. We minimize $e_1$
by applying interchanges to $E^*$. This means that column 1 of $M$ and column 1
of $F$ must be columns of 1's. We cannot have $e_1 = 0$, for this contradicts the
minimality of $\tilde\epsilon(1)$. Let the sum of column 2 of the transformed $\tilde\delta(1)$ by $\tilde\epsilon(1)$
$E$ matrix be $e_2$. We minimize $e_2$ by applying interchanges to the last $\tilde\epsilon(1) - 1$
columns of the transformed $E^*$. Thus column 2 of $M$ and column 2 of $F$ must
be columns of 1's, and again $e_2 > 0$. But $F$ contains only two 1's in each
row and hence $F$ is the maximal matrix with exactly two 1's in each row. Let
the sum of column 3 of the transformed $\tilde\delta(1)$ by $\tilde\epsilon(1)$ $E$ matrix be $e_3$. We minimize
$e_3$ by applying interchanges to the last $\tilde\epsilon(1) - 2$ columns of the transformed
$E^*$, and continue this minimizing process until the matrix $M$ is
maximal.




Theorem 3.1 is the basis for the simple formula for $\tilde\epsilon(\alpha)$ derived in § 4.
Unfortunately the decomposition (3.1) does not have an apparent analogue
for a matrix $A_{\bar\epsilon}$ of maximal $\alpha$-width $\bar\epsilon(\alpha)$. Indeed, the class generated by the
matrix

(3.2) $$A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

has $\bar\epsilon(1) = 3$. Columns 1, 2, and 4 intersect a critical submatrix of $A$. But
it is not possible to replace $A$ by a matrix in its class with a critical submatrix
in the first three columns.

The following information on intermediate $\alpha$-widths follows without difficulty.


THEOREM 3.2. If $\epsilon$ is an integer in the interval
(3.3) $\tilde\epsilon \le \epsilon \le \bar\epsilon$,
then there exists an $A_\epsilon$ of $\alpha$-width $\epsilon$ in the normalized class $\mathfrak{A}$.


Proof. We show that a single interchange applied to a matrix $A_\epsilon$ of $\alpha$-width
$\epsilon$ in $\mathfrak{A}$ cannot raise the $\alpha$-width by two or more. For consider the case in which
a matrix $A_\epsilon$ of $\alpha$-width $\epsilon$ is transformed by one interchange into a matrix $A'$
of $\alpha$-width $\epsilon + 2$ or more. The matrix $A_\epsilon$ must have a critical submatrix $E$
of size $\delta$ by $\epsilon$. It is essential that the single interchange remove a 1 from the
critical submatrix $E$, for otherwise we would have a matrix of $\alpha$-width $\epsilon$ or less.
Let the column vectors $\eta_1, \eta_2, \dots, \eta_\epsilon$ of $A_\epsilon$ intersect the critical submatrix
$E$. The interchange affects two column vectors $\eta_i$ and $\eta$ of $A_\epsilon$. Here $\eta_i$ is one
of the vectors $\eta_1, \eta_2, \dots, \eta_\epsilon$ and $\eta$ is some other column vector of $A_\epsilon$. Let
the interchange transform $\eta_i$ into $\eta_i'$ and $\eta$ into $\eta'$. But now in $A'$ the $\epsilon + 1$
columns $\eta_1, \eta_2, \dots, \eta_i', \dots, \eta_\epsilon, \eta'$ are an $\alpha$-set of representatives for $A'$. Hence
one interchange can raise the $\alpha$-width of $A_\epsilon$ by at most 1. But by the interchange
theorem we may transform by interchanges a matrix $A_{\tilde\epsilon}$ of $\alpha$-width
$\tilde\epsilon$ into a matrix $A_{\bar\epsilon}$ of $\alpha$-width $\bar\epsilon$. This establishes the existence of the matrix
$A_\epsilon$ of $\alpha$-width $\epsilon$.
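The key step of this proof, that one interchange moves the $\alpha$-width by at most 1, can be observed numerically. The sketch below assumes the usual meaning of an interchange (replacing a 2 by 2 minor with rows 1 0 / 0 1 by the minor with rows 0 1 / 1 0, or the reverse); the helper names `interchange`, `width`, and `neighbours`, and the 4 by 4 test matrix, are illustrative choices.

```python
from itertools import combinations

def interchange(A, i, k, j, l):
    """Apply an interchange to rows i, k and columns j, l of A.

    The 2-by-2 minor must be [[1, 0], [0, 1]] or [[0, 1], [1, 0]]; it is
    replaced by the other one, so every row and column sum is unchanged.
    """
    minor = (A[i][j], A[i][l], A[k][j], A[k][l])
    if minor not in ((1, 0, 0, 1), (0, 1, 1, 0)):
        raise ValueError("no interchange is possible on this minor")
    B = [row[:] for row in A]
    for r, c in ((i, j), (i, l), (k, j), (k, l)):
        B[r][c] = 1 - B[r][c]
    return B

def width(A, alpha):
    """Brute-force alpha-width: fewest columns giving every row >= alpha 1's."""
    n = len(A[0])
    for w in range(n + 1):
        for cols in combinations(range(n), w):
            if all(sum(row[j] for j in cols) >= alpha for row in A):
                return w
    raise ValueError("some row of A has fewer than alpha 1's")

def neighbours(A):
    """All matrices reachable from A by exactly one interchange."""
    out = []
    for i, k in combinations(range(len(A)), 2):
        for j, l in combinations(range(len(A[0])), 2):
            try:
                out.append(interchange(A, i, k, j, l))
            except ValueError:
                pass  # this minor admits no interchange
    return out

# Illustrative matrix with row sums (2, 2, 2, 1) and column sums (2, 2, 2, 1).
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
jumps = {abs(width(B, 1) - width(A, 1)) for B in neighbours(A)}
print(jumps)
```

Since interchanges are reversible, the same bound applies to lowering the width, which is why the set `jumps` can only contain 0 and 1.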


4. Canonical matrices. For the normalized class $\mathfrak{A}(R, S)$ let
(4.1) $t_{ef} = r_{e+1} + r_{e+2} + \dots + r_m - (s_1 + s_2 + \dots + s_f) + ef$.
Here $e$ and $f$ are integer parameters such that

(4.2) $0 \le e \le m$,
(4.3) $0 \le f \le n$.












Let $A$ be in $\mathfrak{A}(R, S)$ and suppose that

(4.4) $$A = \begin{bmatrix} W & * \\ * & Z \end{bmatrix}$$

with $W$ of size $e$ by $f$. For a (0, 1)-matrix $Q$ let $N_0(Q)$ denote the number
of 0's in $Q$ and let $N_1(Q)$ denote the number of 1's in $Q$. Then (4.1) can be
rewritten in the form

(4.5) $t_{ef} = N_0(W) + N_1(Z)$.

The invariants $t_{ef}$ of $\mathfrak{A}(R, S)$ are useful in determining the maximal and
minimal trace (12) and the maximal term rank (11) of the matrices in $\mathfrak{A}(R, S)$.

We now define invariants $N(\epsilon, e, f)$ of $\mathfrak{A}(R, S)$ which are generalizations
of (4.1). These invariants turn out to be effective in determining the minimal
$\alpha$-width of the matrices in $\mathfrak{A}(R, S)$. Let

(4.6) $N(\epsilon, e, f) = r_{e+1} + r_{e+2} + \dots + r_m - (s_{\epsilon+1} + s_{\epsilon+2} + \dots + s_f) + e(f - \epsilon)$.

Here $\epsilon$, $e$, $f$ are integer parameters such that

(4.7) $0 \le \epsilon \le n$,
(4.8) $0 \le e \le m$,
(4.9) $\epsilon \le f \le n$.
Note that

(4.10) $N(0, e, f) = t_{ef}$,

and for $\epsilon = 0$, (4.9) reduces to (4.3). Moreover, (4.1) and (4.6) imply

(4.11) $N(\epsilon, e, f) = t_{ef} + (s_1 + s_2 + \dots + s_\epsilon) - e\epsilon$.
Let $A$ be in $\mathfrak{A}(R, S)$ and suppose that
(4.12) $$A = \begin{bmatrix} * & Y & * \\ X & * & Z \end{bmatrix}$$
with $X$ of size $m - e$ by $\epsilon$ and $Y$ of size $e$ by $f - \epsilon$. Then by (4.6),
(4.13) $N(\epsilon, e, f) = N_1(X) + N_0(Y) + N_1(Z)$.
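The agreement between the sum formula (4.6) and the block count (4.13) is a plain counting identity, so it can be checked on any 0-1 matrix. A minimal sketch follows; the function names and the 4 by 4 matrix are illustrative, and indices are shifted to Python's 0-based convention (so `R[e:]` is $r_{e+1}, \dots, r_m$).

```python
def N(R, S, eps, e, f):
    """N(eps, e, f) of (4.6), computed from the row and column sums only."""
    return sum(R[e:]) - sum(S[eps:f]) + e * (f - eps)

def N_from_blocks(A, eps, e, f):
    """N_1(X) + N_0(Y) + N_1(Z) of (4.13), counted directly on A."""
    m, n = len(A), len(A[0])
    ones_X = sum(A[i][j] for i in range(e, m) for j in range(eps))
    zeros_Y = sum(1 - A[i][j] for i in range(e) for j in range(eps, f))
    ones_Z = sum(A[i][j] for i in range(e, m) for j in range(f, n))
    return ones_X + zeros_Y + ones_Z

# Illustrative matrix; R and S are read off from it.
A = [[1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 1, 0, 0]]
R = [sum(row) for row in A]
S = [sum(col) for col in zip(*A)]

agree = all(N(R, S, eps, e, f) == N_from_blocks(A, eps, e, f)
            for eps in range(len(S) + 1)
            for e in range(len(R) + 1)
            for f in range(eps, len(S) + 1))
print(agree)  # True
```

Taking `eps = 0` gives the invariant $t_{ef}$ of (4.1), in line with (4.10).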


Let $A'$ be a matrix in the normalized class of the form

(4.14) $$A' = \begin{bmatrix} M' & J & * \\ \begin{matrix} F' \\ E' \end{matrix} & * & 0 \end{bmatrix}.$$

Here $E'$ is a matrix of size $\delta'$ by $\epsilon'$ with exactly $\alpha$ 1's in each row. $M'$ is a
matrix of size $e'$ by $\epsilon'$ with $\alpha + 1$ or more 1's in each row. $F'$ is a matrix of
size $m - (e' + \delta')$ by $\epsilon'$ with exactly $\alpha + 1$ 1's in each row. $J$ is a matrix
of 1's of size $e'$ by $f' - \epsilon'$ and 0 is a zero matrix. Each of the first $\epsilon'$ columns
of $A'$ contains more than $m - \delta'$ 1's. We require $\delta'$ and $\epsilon' > 0$ but the degenerate
cases $e' = 0$, $e' + \delta' = m$, $f' = \epsilon'$, and $f' = n$ are not excluded. A matrix
fulfilling these requirements is called canonical, and $e'$ and $f'$ are said to be
decomposition numbers for $A'$. The decomposition numbers for a specified $A'$
need not be unique.

It is clear that the matrix $A_{\tilde\epsilon}$ of Theorem 3.1 is canonical with $\epsilon' = \tilde\epsilon$,
$\delta' = \tilde\delta$. The $e$ and $f$ of Theorem 3.1 are decomposition numbers for $A_{\tilde\epsilon}$.


THEOREM 4.1. The $\epsilon'$ of the canonical matrix $A'$ of (4.14) equals the first
non-negative integer $\epsilon$ such that

(4.15) $N(\epsilon, e, f) \ge \alpha(m - e)$

for all integer parameters $e$ and $f$ restricted by (4.8) and (4.9).


Proof. Let $\epsilon$ be fixed and restricted by (4.7) and suppose that for some $e$
and $f$ restricted by (4.8) and (4.9)

(4.16) $N(\epsilon, e, f) < \alpha(m - e)$.
Then
(4.17) $\epsilon < \epsilon'$.

For suppose that (4.16) holds and that $\epsilon \ge \epsilon'$. Then the first $\epsilon$ columns of
$A'$ contain at least $\alpha$ 1's in each row. But then by (4.13) the $\epsilon$, $e$, and $f$ of (4.16)
satisfy $N(\epsilon, e, f) \ge \alpha(m - e)$ and this contradicts (4.16). Hence (4.16) implies
(4.17).

Let $\epsilon$ be fixed and restricted by (4.7) and suppose that for each $e$ and $f$
restricted by (4.8) and (4.9)

(4.18) $N(\epsilon, e, f) \ge \alpha(m - e)$.
Then
(4.19) $\epsilon' \le \epsilon$.

For suppose that (4.18) holds and that $\epsilon < \epsilon'$. Then for the decomposition
numbers $e'$ and $f'$ of (4.14)

(4.20) $0 \le e' \le m$,
and
(4.21) $\epsilon < \epsilon' \le f' \le n$.

By (4.13),

(4.22) $N(\epsilon, e', f') = N(\epsilon', e', f') + N_0(T) - N_1(U)$,
where $T$ denotes the submatrix formed by the intersection of rows $1, 2, \dots, e'$
and columns $\epsilon + 1, \epsilon + 2, \dots, \epsilon'$ of $A'$, and $U$ the intersection of rows
$e' + 1, e' + 2, \dots, m$ and columns $\epsilon + 1, \epsilon + 2, \dots, \epsilon'$ of $A'$. Now each of
the first $\epsilon'$ columns of $A'$ contains more than $m - \delta'$ 1's. Hence

(4.23) $N_1(U) - N_0(T) + e'(\epsilon' - \epsilon) = s_{\epsilon+1} + s_{\epsilon+2} + \dots + s_{\epsilon'} > (m - \delta')(\epsilon' - \epsilon)$,
and
(4.24) $N_1(U) - N_0(T) > m - (e' + \delta')$.
Moreover,

(4.25) $N(\epsilon', e', f') = (\alpha + 1)(m - e') - \delta'$.

Hence by (4.22), (4.25), and (4.24),

(4.26) $N(\epsilon, e', f') = (\alpha + 1)(m - e') - \delta' + N_0(T) - N_1(U)
< (\alpha + 1)(m - e') - \delta' - m + e' + \delta' = \alpha(m - e')$.

But this contradicts (4.18). Hence (4.18) implies (4.19) and this proves
Theorem 4.1.


THEOREM 4.2. Let $\tilde\epsilon$ be the minimal $\alpha$-width of the normalized class $\mathfrak{A}(R, S)$.
The $\epsilon'$ of the canonical matrix $A'$ of (4.14) equals $\tilde\epsilon$ and $\tilde\epsilon$ is the first non-negative
integer $\epsilon$ such that
(4.27) $N(\epsilon, e, f) \ge \alpha(m - e)$
for all integer parameters $e$ and $f$ restricted by (4.8) and (4.9).


Proof. This follows from Theorem 4.1 and the fact that the matrix $A_{\tilde\epsilon}$ of
Theorem 3.1 is canonical.

Theorem 4.2 provides a simple computation for $\tilde\epsilon$. Beginning with the first
$\epsilon$ such that $s_1 + s_2 + \dots + s_\epsilon \ge \alpha m$, one can successively calculate the
arrays $N(\epsilon, e, f) + \alpha e$, $N(\epsilon + 1, e, f) + \alpha e$, ..., each for appropriate $e$ and $f$,
stopping when all entries of the array are at least equal to $\alpha m$. The starting
value of $\epsilon$ in the calculation is clearly a lower bound for $\tilde\epsilon$.
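The computation just described can be sketched as follows. This is a minimal illustration of Theorem 4.2, assuming that $R$ and $S$ are non-increasing and that the class $\mathfrak{A}(R, S)$ is non-empty; the function name `minimal_width` is hypothetical. For simplicity the loop starts at $\epsilon = 0$ rather than at the lower bound mentioned above, which gives the same answer.

```python
def N(R, S, eps, e, f):
    """N(eps, e, f) of (4.6); Python slices play the role of the index ranges."""
    return sum(R[e:]) - sum(S[eps:f]) + e * (f - eps)

def minimal_width(R, S, alpha):
    """First eps with N(eps, e, f) >= alpha * (m - e) for all admissible e, f.

    Assumes R, S are non-increasing and that some (0, 1)-matrix has these
    row and column sums; both assumptions are the caller's responsibility.
    """
    m, n = len(R), len(S)
    for eps in range(n + 1):
        if all(N(R, S, eps, e, f) >= alpha * (m - e)
               for e in range(m + 1)
               for f in range(eps, n + 1)):
            return eps
    raise ValueError("alpha exceeds the smallest row sum")

# Row and column sums of the numerical example given later in this section.
R = (6, 5, 3, 2, 2, 2, 1, 1)
S = (4, 4, 4, 4, 4, 1, 1)
print(minimal_width(R, S, 1))  # 3
```

Each test of an $\epsilon$ costs on the order of $mn$ evaluations of $N$, so no matrix in the class ever has to be constructed.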

The next theorem shows that all pairs of decomposition numbers $e'$, $f'$ can
be singled out from the array $N(\tilde\epsilon - 1, e, f) + \alpha e$.


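THEOREM 4.3. Let $A'$ be the canonical matrix of (4.14) with $\epsilon' = \tilde\epsilon$. Let
(4.28) $\tilde N = \min_{e, f} \, [N(\tilde\epsilon - 1, e, f) + \alpha e]$,
where $0 \le e \le m$ and $\tilde\epsilon - 1 \le f \le n$. Then
(4.29) $\tilde N = N(\tilde\epsilon - 1, e', f') + \alpha e'$
if and only if $e'$ and $f'$ are decomposition numbers for $A'$.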


Proof. Let $e'$ and $f'$ be decomposition numbers for $A'$. Then $0 \le e' \le m$
and $\tilde\epsilon \le f' \le n$. We consider first the case in which $e \le e'$ and $\tilde\epsilon \le f \le n$.
Then

(4.30) $N(\tilde\epsilon - 1, e', f') = N(\tilde\epsilon - 1, e, f) + N_0(T) - N_1(U) - N_0(V) - N_1(W)$.

Here $T$ is the intersection of rows $e + 1, e + 2, \dots, e'$ and column $\tilde\epsilon$ of $A'$,
$U$ is the intersection of rows $e + 1, e + 2, \dots, e'$ and columns $1, 2, \dots, \tilde\epsilon - 1$
of $A'$, $V$ is the intersection of rows $1, 2, \dots, e$ and columns $\tilde\epsilon + 1, \tilde\epsilon + 2, \dots, f$
of $A'$, and $W$ is the intersection of rows $e + 1, e + 2, \dots, m$ and columns
$f + 1, f + 2, \dots, n$ of $A'$. Now since $e'$ and $f'$ are decomposition numbers
for $A'$,

(4.31) $N_1(U) + [e' - e - N_0(T)] = (\alpha + 1)(e' - e) + p$,

where $p$ is a non-negative integer. Hence

(4.32) $N_0(T) - N_1(U) = \alpha(e - e') - p$

and

(4.33) $N(\tilde\epsilon - 1, e', f') + \alpha e' = N(\tilde\epsilon - 1, e, f) + \alpha e - p - N_0(V) - N_1(W)$.
Thus
(4.34) $N(\tilde\epsilon - 1, e', f') + \alpha e' \le N(\tilde\epsilon - 1, e, f) + \alpha e$

and equality holds if and only if $p = 0$, $N_0(V) = 0$, $N_1(W) = 0$. But $p = 0$,
$N_0(V) = 0$, $N_1(W) = 0$ if and only if $e$ and $f$ are decomposition numbers
for $A'$.

Next consider the case in which $e' < e$ and $\tilde\epsilon \le f \le n$. Then

(4.35) $N(\tilde\epsilon - 1, e', f') = N(\tilde\epsilon - 1, e, f) + N_1(U) - N_0(T) - N_0(V) - N_1(W)$.

Here $T$ is the intersection of rows $e' + 1, e' + 2, \dots, e$ and column $\tilde\epsilon$ of $A'$,
$U$ is the intersection of rows $e' + 1, e' + 2, \dots, e$ and columns $1, 2, \dots, \tilde\epsilon - 1$
of $A'$, $V$ is the intersection of rows $1, 2, \dots, e$ and columns $\tilde\epsilon + 1, \tilde\epsilon + 2, \dots, f$
of $A'$, and $W$ is the intersection of rows $e + 1, e + 2, \dots, m$ and columns
$f + 1, f + 2, \dots, n$ of $A'$. Now

(4.36) $N_1(U) + [e - e' - N_0(T)] = \alpha(e - e') + q$,
where $q$ is a non-negative integer satisfying

(4.37) $q + e' - e \le 0$.

Hence

(4.38) $N(\tilde\epsilon - 1, e', f') + \alpha e' = N(\tilde\epsilon - 1, e, f) + \alpha e + e' - e + q - N_0(V) - N_1(W)$.

Thus
(4.39) $N(\tilde\epsilon - 1, e', f') + \alpha e' \le N(\tilde\epsilon - 1, e, f) + \alpha e$
and equality holds if and only if $q = e - e'$ and $N_0(V) = N_1(W) = 0$, that
is, if and only if $e$ and $f$ are decomposition numbers for $A'$.

We now extend the range of $f$ to $\tilde\epsilon - 1 \le f \le n$. It suffices to show that
if $f = \tilde\epsilon - 1$, then
$N(\tilde\epsilon - 1, e, f) + \alpha e > N(\tilde\epsilon - 1, e', f') + \alpha e'$.

But this follows without difficulty from the equations
$N(\tilde\epsilon - 1, e, \tilde\epsilon - 1) = r_{e+1} + r_{e+2} + \dots + r_m$
and
$N(\tilde\epsilon - 1, e', f') = (\alpha + 1)(m - e') - \tilde\delta + e' - s_{\tilde\epsilon}$.
This completes the proof of Theorem 4.3.


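THEOREM 4.4. Let $\tilde\delta$ be the multiplicity of $\alpha$ with respect to $\tilde\epsilon$. The $\delta'$ of the
canonical matrix $A'$ of (4.14) equals $\tilde\delta$ and

(4.40) $\tilde\delta = (\alpha + 1)m - \tilde N - s_{\tilde\epsilon}$.

Proof. Let $A'$ be the canonical matrix of (4.14). Then

(4.41) $N(\tilde\epsilon, e', f') = N(\tilde\epsilon - 1, e', f') + s_{\tilde\epsilon} - e'$.

But

(4.42) $N(\tilde\epsilon, e', f') = (\alpha + 1)(m - e') - \delta'$

and hence
(4.43) $\delta' = (\alpha + 1)m - \tilde N - s_{\tilde\epsilon}$.

Moreover, the matrix $A_{\tilde\epsilon}$ of Theorem 3.1 is canonical so that $\delta' = \tilde\delta$.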
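Theorems 4.3 and 4.4 together turn the array $N(\tilde\epsilon - 1, e, f) + \alpha e$ into the decomposition numbers and the multiplicity. A minimal sketch, assuming $\tilde\epsilon \ge 1$ has already been found (for instance by the test of Theorem 4.2); the function name `multiplicity_data` and its return layout are illustrative.

```python
def N(R, S, eps, e, f):
    """N(eps, e, f) of (4.6)."""
    return sum(R[e:]) - sum(S[eps:f]) + e * (f - eps)

def multiplicity_data(R, S, alpha, eps_min):
    """Return (N_tilde, decomposition pairs, delta_tilde).

    N_tilde is the minimum in (4.28), the pairs are all (e, f) attaining it
    (Theorem 4.3), and delta_tilde comes from (4.40).  eps_min is the minimal
    alpha-width and is assumed to be at least 1.
    """
    m, n = len(R), len(S)
    table = {(e, f): N(R, S, eps_min - 1, e, f) + alpha * e
             for e in range(m + 1)
             for f in range(eps_min - 1, n + 1)}
    n_tilde = min(table.values())
    pairs = sorted(p for p, v in table.items() if v == n_tilde)
    # S[eps_min - 1] is s_{eps_min} in the 1-based notation of the text.
    delta = (alpha + 1) * m - n_tilde - S[eps_min - 1]
    return n_tilde, pairs, delta

R = (6, 5, 3, 2, 2, 2, 1, 1)
S = (4, 4, 4, 4, 4, 1, 1)
print(multiplicity_data(R, S, 1, 3))  # (7, [(2, 5)], 5)
```

For this class the minimum is attained at a single pair, so the decomposition numbers are unique here; in general the returned list may contain several pairs.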

We conclude this section with a numerical example illustrating the computation
of $\tilde\epsilon(1)$, $\tilde\delta(1)$, and the decomposition numbers $e'$, $f'$ for a normalized
class. Let $\mathfrak{A}(R, S)$ be determined by

R = (6, 5, 3, 2, 2, 2, 1, 1),
S = (4, 4, 4, 4, 4, 1, 1).

The arrays $N(2, e, f) + e$, for $0 \le e \le 8$, $2 \le f \le 7$, and $N(3, e, f) + e$, for
$0 \le e \le 8$, $3 \le f \le 7$, yield all pertinent information. They are shown in
Table II.





TABLE II

 N(2, e, f) + e  ($\epsilon = 2$)             N(3, e, f) + e  ($\epsilon = 3$)

 e\f    2   3   4   5   6   7           e\f    3   4   5   6   7
  0    22  18  14  10   9   8            0    22  18  14  13  12
  1    17  14  11   8   8   8            1    17  14  11  11  11
  2    13  11   9   7   8   9            2    13  11   9  10  11
  3    11  10   9   8  10  12            3    11  10   9  11  13
  4    10  10  10  10  13  16            4    10  10  10  13  16
  5     9  10  11  12  16  20            5     9  10  11  15  19
  6     8  10  12  14  19  24            6     8  10  12  17  22
  7     8  11  14  17  23  29            7     8  11  14  20  26






The recursions

(4.44) $N(\epsilon, e + 1, f) = N(\epsilon, e, f) - r_{e+1} + f - \epsilon$,
(4.45) $N(\epsilon, e, f + 1) = N(\epsilon, e, f) + e - s_{f+1}$,
(4.46) $N(\epsilon + 1, e, f) = N(\epsilon, e, f) + s_{\epsilon+1} - e$

are useful in constructing such arrays.
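As an illustration of how these recursions build an array, the sketch below fills the table $N(\epsilon, e, f)$ for fixed $\epsilon$ starting from the single value $N(\epsilon, 0, \epsilon) = r_1 + \dots + r_m$, using only (4.44) and (4.45), and compares it with the direct formula (4.6). The function names are illustrative, and the 0-based list entries `R[e]` and `S[f]` stand for $r_{e+1}$ and $s_{f+1}$.

```python
def N_direct(R, S, eps, e, f):
    """N(eps, e, f) straight from (4.6)."""
    return sum(R[e:]) - sum(S[eps:f]) + e * (f - eps)

def N_array(R, S, eps):
    """Build {(e, f): N(eps, e, f)} from the recursions (4.44) and (4.45)."""
    m, n = len(R), len(S)
    table = {(0, eps): sum(R)}            # N(eps, 0, eps) = r_1 + ... + r_m
    for f in range(eps, n):               # (4.45) along the row e = 0
        table[(0, f + 1)] = table[(0, f)] + 0 - S[f]
    for e in range(m):                    # (4.44) down every column
        for f in range(eps, n + 1):
            table[(e + 1, f)] = table[(e, f)] - R[e] + f - eps
    return table

R = (6, 5, 3, 2, 2, 2, 1, 1)
S = (4, 4, 4, 4, 4, 1, 1)
table2 = N_array(R, S, 2)
print(table2[(2, 5)] + 2)  # 7, the entry N(2, 2, 5) + 2 of Table II
```

The recursion (4.46) then moves from one array to the next, so the table for $\epsilon + 1$ can be obtained from the table for $\epsilon$ by adding $s_{\epsilon+1} - e$ to row $e$.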


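Since
$N(2, 2, 5) + 2 = 7 < 8$,
$N(3, e, f) + e \ge 8$, $(0 \le e \le 8,\ 3 \le f \le 7)$,

we have $\tilde\epsilon(1) = 3$. Also $\tilde N = 7$ corresponding to the unique decomposition
numbers $e' = 2$, $f' = 5$, and hence $\tilde\delta(1) = 5$. A canonical matrix in the class
is given by

$$A' = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0
\end{bmatrix}$$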
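The matrix above can be checked mechanically against the requirements listed after (4.14). The sketch below is only a verification aid: the predicate `is_canonical` is a hypothetical helper that encodes those block conditions for given $\alpha$, $\epsilon'$, $\delta'$, $e'$, $f'$, and it does not test whether $\epsilon'$ is actually the minimal width of the class.

```python
def is_canonical(A, alpha, eps, delta, e, f):
    """Check the block conditions of (4.14) for the given parameters."""
    m, n = len(A), len(A[0])
    left = [sum(row[:eps]) for row in A]          # row sums in the first eps columns
    col = [sum(A[i][j] for i in range(m)) for j in range(n)]
    return (all(s >= alpha + 1 for s in left[:e])                 # rows of M'
            and all(s == alpha + 1 for s in left[e:m - delta])    # rows of F'
            and all(s == alpha for s in left[m - delta:])         # rows of E'
            and all(A[i][j] == 1 for i in range(e) for j in range(eps, f))   # J
            and all(A[i][j] == 0 for i in range(e, m) for j in range(f, n))  # 0
            and all(col[j] > m - delta for j in range(eps)))      # heavy columns

A = [[1, 1, 1, 1, 1, 1, 0],
     [1, 1, 0, 1, 1, 0, 1],
     [1, 1, 0, 1, 0, 0, 0],
     [1, 0, 0, 1, 0, 0, 0],
     [0, 1, 0, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0]]

print([sum(row) for row in A])        # [6, 5, 3, 2, 2, 2, 1, 1]
print([sum(c) for c in zip(*A)])      # [4, 4, 4, 4, 4, 1, 1]
print(is_canonical(A, alpha=1, eps=3, delta=5, e=2, f=5))  # True
```

Here $M'$ is the 2 by 3 block in the upper left, $F'$ is the single third row of the first three columns, and $E'$ occupies the last five rows.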


5. Construction of canonical matrices. We are now in a position to
give a simple procedure for the construction of a canonical matrix $A'$. Before
doing so we recall some facts about the construction of a (0, 1)-matrix of size
$m$ by $n$ having a specified row sum vector $R = (r_1, r_2, \dots, r_m)$ and column
sum vector $S = (s_1, s_2, \dots, s_n)$ (5; 7; 10). Let $R_1$ be a row vector of $r_1$ 1's
and $n - r_1$ 0's. Let the 1's be inserted in the positions in which $S$ has its $r_1$
largest components. Let $R_2$ be a row vector of $r_2$ 1's and $n - r_2$ 0's. Let the
1's be inserted in the positions in which $S - R_1$ has its $r_2$ largest components.
$R_3$ is a row vector whose 1's are in the positions in which $S - R_1 - R_2$ has
its $r_3$ largest components, and so on. Now let $A$ be a matrix with row sum vector
$R$ and column sum vector $S$. We may apply interchanges to $A$ and replace
row 1 of $A$ by $R_1$. Then we apply interchanges to the transformed matrix
and replace row 2 by $R_2$. These interchanges do not involve $R_1$. In this way
we transform $A$ by interchanges into a matrix $A^*$ composed of the row vectors
$R_1, R_2, \dots, R_m$. But this tells us that $A^*$ has row sum vector $R$ and column
sum vector $S$, and hence we have a procedure for constructing a matrix in
the class $\mathfrak{A}(R, S)$.
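The row-by-row procedure recalled above translates directly into code. The following is a minimal sketch, not the authors' own program: the function name `construct` is hypothetical, ties among equal components are broken by taking the leftmost columns, and a `ValueError` is raised when the procedure gets stuck, which is taken here as the sign that the class is empty.

```python
def construct(R, S):
    """Build a (0, 1)-matrix with row sums R and column sums S, row by row.

    Row i receives its r_i 1's in the columns where the residual column
    sum vector S - R_1 - ... - R_{i-1} has its r_i largest components.
    """
    n = len(S)
    residual = list(S)
    A = []
    for r in R:
        # Columns ordered by decreasing residual sum; sorted() is stable,
        # so ties go to the leftmost column.
        cols = sorted(range(n), key=lambda j: -residual[j])[:r]
        if len(cols) < r or any(residual[j] == 0 for j in cols):
            raise ValueError("no (0, 1)-matrix has these row and column sums")
        row = [0] * n
        for j in cols:
            row[j] = 1
            residual[j] -= 1
        A.append(row)
    if any(residual):
        raise ValueError("no (0, 1)-matrix has these row and column sums")
    return A

R = (6, 5, 3, 2, 2, 2, 1, 1)
S = (4, 4, 4, 4, 4, 1, 1)
A_star = construct(R, S)
for row in A_star:
    print(row)
```

The matrix produced this way lies in the class but is in general not canonical; the remainder of this section explains how the same idea is applied block by block to obtain a canonical matrix.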

We now construct a canonical matrix $A'$ of the form (4.14) in the normalized
class $\mathfrak{A}(R, S)$. The theorems of § 4 give formulas for the integers
$\epsilon' = \tilde\epsilon$, $\delta' = \tilde\delta$, $e'$ and $f'$ in terms of $R$ and $S$. The submatrix of $A'$ formed
by the intersection of rows $e' + 1, e' + 2, \dots, m$ and columns $\epsilon' + 1,
\epsilon' + 2, \dots, f'$ has its row and column sum vectors determined. Hence this
submatrix may be inserted.


Let

(5.1) $$B = \begin{bmatrix} M' & G' \\ \begin{matrix} F' \\ E' \end{matrix} & 0 \end{bmatrix}$$

be the $m$ by $n - (f' - \epsilon')$ submatrix of $A'$ formed from $A'$ by the deletion
of columns $\epsilon' + 1, \epsilon' + 2, \dots, f'$. The matrix $B$ comprises the undetermined
portion of $A'$. We know the row sums of $B$, $F'$, and $E'$ and the column
sums of $B$ and $G'$. Let
(5.2) $B' = [M' \; G']$
denote the first $e'$ rows of $B$ and let

(5.3) $S' = (s_{f'+1}, s_{f'+2}, \dots, s_n)$

denote the column sum vector of $G'$. We apply interchanges to $B'$ and place
the 1's in column 1 of $G'$ in those rows of $B'$ that possess the $s_{f'+1}$ largest
row sums. Now we ignore column 1 of $G'$ of the transformed $B'$ matrix and
apply interchanges to column 2 of $G'$. These interchanges do not disturb
column 1 of $G'$ and they place the 1's in column 2 of $G'$ in those rows of $B'$
that possess, with column 1 of $G'$ excluded, the $s_{f'+2}$ largest row sums. This
gives a construction for $G'$. But then this determines a row sum vector for
$G'$ and hence a row sum vector for $M'$. The construction for $G'$ is such that
each of the components of the row sum vector of $M'$ exceeds $\alpha$. In fact 1's
are inserted in the columns of $G'$ by a procedure that keeps the size of the
row sums of $M'$ as uniform as possible. The undetermined portion of $B$ now
consists of the first $\epsilon'$ columns of $B$. But we know the row sum vector and
column sum vector of this $m$ by $\epsilon'$ matrix, and hence we have a construction
for a canonical matrix $A'$.


6. Special classes. Let $\mathfrak{A}(K, S)$ denote the normalized class of $m$ by $n$
(0, 1)-matrices having row sum vector $K = (k, k, \dots, k)$ and column sum
vector $S = (s_1, s_2, \dots, s_n)$. Similarly, let $\mathfrak{A}(R, K)$ denote the normalized class
of $m$ by $n$ (0, 1)-matrices having row sum vector $R = (r_1, r_2, \dots, r_m)$ and
column sum vector $K = (k, k, \dots, k)$. For these special classes, the canonical
form (4.14) is always degenerate.


THEOREM 6.1. Every canonical matrix $A'$ of form (4.14) in $\mathfrak{A}(K, S)$ has
decomposition numbers $e' = 0$, $f' = n$. Every canonical matrix $A'$ of form (4.14)
in $\mathfrak{A}(R, K)$ has decomposition number $f' = n$.














Proof. Let $A'$ of form (4.14) be in the normalized class $\mathfrak{A}(K, S)$, and suppose
$e' > 0$. Then, comparing first and last row sums of $A'$, we have
$\alpha + 1 + f' - \epsilon' \le k \le \alpha + f' - \epsilon'$. This contradiction shows that $e' = 0$, and hence
$f' = n$.

Let $A'$ of form (4.14) be in the normalized class $\mathfrak{A}(R, K)$, and suppose
$f' < n$. Comparing first and last column sums of $A'$ yields $e' \ge k > m - \delta'$,
a contradiction. Thus $f' = n$.


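For the class $\mathfrak{A}(K, S)$, the lower bound for $\tilde\epsilon$ mentioned following the proof
of Theorem 4.2 is always achieved: $\tilde\epsilon$ is the first $\epsilon$ such that

(6.1) $s_1 + s_2 + \dots + s_\epsilon \ge \alpha m$.

For in $A'$ of (4.14) with $e' = 0$,

(6.2) $s_1 + s_2 + \dots + s_{\tilde\epsilon} = \alpha m + (m - \tilde\delta)$.

Hence $s_1 + s_2 + \dots + s_{\tilde\epsilon - 1} < \alpha m$ and $s_1 + s_2 + \dots + s_{\tilde\epsilon} \ge \alpha m$. Moreover,
$\tilde\delta$ for the normalized class $\mathfrak{A}(K, S)$ is given by

(6.3) $\tilde\delta = (\alpha + 1)m - (s_1 + s_2 + \dots + s_{\tilde\epsilon})$.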
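For classes with constant row sums, (6.1) and (6.3) reduce the whole computation to a scan over the partial sums of $S$. A minimal sketch, assuming $\alpha \ge 1$, a non-increasing $S$, and a non-empty class; the function name `constant_row_class` is hypothetical.

```python
from itertools import accumulate

def constant_row_class(m, S, alpha):
    """Return (minimal alpha-width, multiplicity) for a class with m rows of
    equal sum and column sum vector S, using (6.1) and (6.3).

    Assumes alpha >= 1, S non-increasing, and a non-empty class.
    """
    for eps, prefix in enumerate(accumulate(S), start=1):
        if prefix >= alpha * m:                      # (6.1)
            return eps, (alpha + 1) * m - prefix     # (6.3)
    raise ValueError("alpha is larger than the common row sum")

# Three rows and three columns, every row and column sum equal to 2.
print(constant_row_class(3, (2, 2, 2), 1))  # (2, 2)
print(constant_row_class(3, (2, 2, 2), 2))  # (3, 3)
```

The common row sum $k$ does not appear explicitly because it is fixed by $S$ through $km = s_1 + \dots + s_n$.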


REFERENCES 


1. C. Berge, Two theorems in graph theory, Proc. Nat. Acad. Sci., 43 (1957), 842-844.
2. L. R. Ford, Jr. and D. R. Fulkerson, A simple algorithm for finding maximal network flows
and an application to the Hitchcock problem, Can. J. Math., 9 (1957), 210-218.
3. D. R. Fulkerson, A network flow feasibility theorem and combinatorial applications, Can. J.
Math., 11 (1959), 440-451.
4. D. R. Fulkerson, Zero-one matrices with zero trace, Pac. J. Math., 10 (1960), 831-836.
5. D. Gale, A theorem on flows in networks, Pac. J. Math., 7 (1957), 1073-1082.
6. R. Gomory, All-integer integer programming algorithm, I.B.M. Report RC-189 (1960), to
appear in a special issue of the I.B.M. Journal.
7. R. M. Haber, Term rank of 0,1 matrices, Rend. Sem. Mat. Padova, 30 (1960), 24-51.
8. A. J. Hoffman, Some recent applications of the theory of linear inequalities to extremal
combinatorial analysis, Proceedings of Symposia in Applied Mathematics, Amer. Math. Soc.,
10 (1960), 113-128.
9. J. P. Roth, Combinatorial topological methods in the synthesis of switching circuits, I.B.M.
Report RC-11 (1957).
10. H. J. Ryser, Combinatorial properties of matrices of zeros and ones, Can. J. Math., 9 (1957),
371-377.
11. H. J. Ryser, The term rank of a matrix, Can. J. Math., 10 (1958), 57-65.
12. H. J. Ryser, Traces of matrices of zeros and ones, Can. J. Math., 12 (1960), 463-476.
13. H. J. Ryser, Matrices of zeros and ones, Bull. Amer. Math. Soc., 66 (1960), 442-464.
14. A. W. Tucker, On directed graphs and integer programs, to appear in a special issue of the
I.B.M. Journal.


The RAND Corporation 
and 
Ohio State University 











SUBLATTICES OF A FREE LATTICE 
BJARNI JONSSON 


Introduction. Professor R. A. Dean has proved (1, Theorem 3) that a 
completely free lattice generated by a countable partially ordered set is 
isomorphic to a sublattice of a free lattice. In particular, it follows that a 
free product of countably many countable chains can be isomorphically 
embedded in a free lattice. Generalizing this we show (2.1) that the class of 
all lattices that can be isomorphically embedded in free lattices is closed 
under the operation of forming free lattice-products with arbitrarily many 
factors. We also prove (2.4) that this class is closed under the operation of 
forming simply ordered sums with denumerably many summands. Finally 
we show (2.7) that every finite dimensional sublattice of a free lattice is finite. 

The first theorem mentioned above is based on a result (1.3) of a rather 
general nature concerning free products of algebraic systems. It is perhaps 
worth noting that the amalgamation property considered there has played a 
role in the investigations of other embedding problems in universal algebra. 
Compare in this connection Fraïssé (2) and Jónsson (3).


1. A theorem in universal algebra. We consider a class $K$ of algebras,
or algebraic systems, $\mathfrak{A} = \langle A, F_0, F_1, \dots, F_\xi, \dots \rangle_{\xi < \alpha}$, where $A$ is a non-empty
set, $\alpha$ is a finite or infinite ordinal and, for each $\xi < \alpha$, $F_\xi$ is an operation of
some finite rank $\mu_\xi$ over $A$, that is, $F_\xi$ is a function on $A^{\mu_\xi}$ to $A$. The ordinal $\alpha$
and the natural numbers $\mu_\xi$ are assumed to be the same for all members
of $K$. Actually we shall identify the algebra $\mathfrak{A}$ with its underlying set $A$.
Assuming the notions of isomorphism, homomorphism, and subalgebra to
be known, we recall the definitions of a free $K$-algebra and of a free $K$-product.

We say that $A$ is a free $K$-algebra generated by $X$ if and only if $A \in K$,
$X \subseteq A$, $A$ is generated by $X$, and for any mapping $f$ of $X$ into an algebra
$B \in K$ there exists a homomorphism $g$ of $A$ into $B$ such that $g(x) = f(x)$ for
all $x \in X$.

We say that $A$ is a free $K$-algebra if and only if there exists a set $X$ such
that $A$ is a free $K$-algebra generated by $X$.

We say that $A$ is a free $K$-product of $A_i$, $i \in I$, if and only if $A \in K$, $I$ is
a non-empty set, $A_i \in K$ and $A_i$ is a subalgebra of $A$ for all $i \in I$, $A$ is
generated by the set $\bigcup_{i \in I} A_i$, and for any homomorphisms $f_i$ of the algebras
$A_i$ into an algebra $B \in K$ there exists a homomorphism $g$ of $A$ into $B$ such
that $g(x) = f_i(x)$ whenever $i \in I$ and $x \in A_i$.





Received December 15, 1959. These investigations were supported by a grant from the 
National Science Foundation. 


















(The free $K$-product just defined might be properly called an inner free
$K$-product. An outer free $K$-product of algebras $B_i \in K$, $i \in I$, would consist
of an algebra $A \in K$ together with isomorphisms of the algebras $B_i$ into $A$
such that $A$ is an inner free $K$-product of the images $A_i$ of the algebras $B_i$.)

To insure the existence of free $K$-algebras and of free $K$-products we shall
assume that $K$ is a non-trivial equational class in the wider sense, that is, $K$
is the class of all models of some finite or infinite set of equations, and there
exists an algebra $A \in K$ having at least two elements. We also assume that
$K$ has the following property:


Definition 1.1. A class $K$ of algebraic systems is said to have the embedding
property if and only if for any $A, B \in K$ there exists $C \in K$ such that $A$ and
$B$ are isomorphic to subsystems of $C$.


These assumptions are somewhat stronger than is necessary, but they are 
sufficiently general for the present purpose. 

Under these assumptions concerning K we have: 

For any non-empty set $X$ there exists an algebra $A$ such that $A$ is a free
$K$-algebra generated by $X$.

Suppose $A$ is a free $K$-algebra generated by $X$, and $f$ is a one-to-one mapping
of $X$ onto a subset $Y$ of an algebra $B$. Then $B$ is a free $K$-algebra generated
by $Y$ if and only if there exists an isomorphism $g$ of $A$ onto $B$ such that
$g(x) = f(x)$ for all $x \in X$.

If $I$ is a non-empty set and if $B_i \in K$ for each $i \in I$, then there exist an
algebra $A$ and isomorphisms of all the algebras $B_i$ onto subalgebras $A_i$ of $A$
such that $A$ is a free $K$-product of $A_i$, $i \in I$.

Suppose $A$ is a free $K$-product of $A_i$, $i \in I$, and for each $i \in I$ suppose $f_i$
is an isomorphism of $A_i$ onto a subalgebra $B_i$ of an algebra $B$. Then $B$ is a
free $K$-product of $B_i$, $i \in I$, if and only if there exists an isomorphism $g$ of
$A$ onto $B$ such that $g(x) = f_i(x)$ whenever $i \in I$ and $x \in A_i$.

Suppose with the elements $i$ of a non-empty set $I$ there are associated
pairwise disjoint, non-empty subsets $X_i$ of an algebra $A$, let $X = \bigcup_{i \in I} X_i$,
and for each $i \in I$ let $A_i$ be the subalgebra of $A$ which is generated by $X_i$.
Then $A$ is a free $K$-algebra generated by $X$ if and only if $A$ is a free $K$-product
of $A_i$, $i \in I$, and for each $i \in I$, $A_i$ is a free $K$-algebra generated by $X_i$.

We now introduce the amalgamation property mentioned in the intro- 
duction. 


Definition 1.2. A class $K$ of algebraic systems is said to have the amalgamation
property if and only if the following conditions are satisfied:

If $A, B_0, B_1 \in K$ and if $f_0$ and $f_1$ are isomorphisms of $A$ into $B_0$ and into
$B_1$, respectively, then there exist $C \in K$ and isomorphisms $g_0$ and $g_1$ of $B_0$
and of $B_1$, respectively, into $C$, such that $g_0 f_0(x) = g_1 f_1(x)$ whenever $x \in A$.


THEOREM 1.3. Suppose the class $K$ of algebraic systems is non-trivial and
equational in the wider sense, and assume that $K$ has the embedding property
and the amalgamation property. If $A$ is a free $K$-product of $A_i$, $i \in I$, if $B_i$ is
a subalgebra of $A_i$ for each $i \in I$, and if $B$ is the subalgebra of $A$ that is generated
by the set $\bigcup_{i \in I} B_i$, then $B$ is a free $K$-product of $B_i$, $i \in I$.


Proof. For convenience we assume that $I$ is the set of all ordinals $\xi < \alpha$,
where $\alpha$ is some fixed ordinal. There exist $C \in K$ and isomorphisms $f_\xi$ of $B_\xi$
onto subalgebras $C_\xi$ of $C$ for all $\xi < \alpha$, such that $C$ is a free $K$-product of
$C_\xi$, $\xi < \alpha$. Consequently there exists a homomorphism $h$ of $C$ into $B$ such
that $h(x) = f_\xi^{-1}(x)$ whenever $\xi < \alpha$ and $x \in C_\xi$. Since $B$ is generated by the
union of the algebras $B_\xi$, $h$ must actually map $C$ onto $B$. The proof will
therefore be complete if we prove that $h$ is one-to-one.

We shall show that there exists an increasing sequence of algebras $D_0 = C$,
$D_1, D_2, \dots, D_\alpha$ in $K$ and a sequence of functions $g_0, g_1, \dots, g_\xi, \dots$, $\xi < \alpha$,
such that the following conditions hold for each $\xi < \alpha$:

(1) $g_\xi$ maps $A_\xi$ isomorphically into $D_{\xi+1}$.
(2) $g_\xi(y) = f_\xi(y)$ whenever $y \in B_\xi$.


In fact, suppose $0 < \lambda \le \alpha$, and suppose $D_\xi$ has been defined for all $\xi < \lambda$,
and $g_\xi$ has been defined for all $\xi$ with $\xi + 1 < \lambda$ in such a way that (1) and
(2) hold whenever $\xi + 1 < \lambda$. If $\lambda$ is a limit ordinal, then we let $D_\lambda$ be the
union of the algebras $D_\nu$ with $\nu < \lambda$. Thus $D_\xi$ is defined for all $\xi < \lambda + 1$.
Since the conditions $\xi + 1 < \lambda + 1$ and $\xi + 1 < \lambda$ are equivalent, we see
that $g_\xi$ is defined and that (1) and (2) hold whenever $\xi + 1 < \lambda + 1$. If $\lambda$ is
not a limit ordinal, say $\lambda = \nu + 1$, then $f_\nu$ maps $B_\nu$ isomorphically into
$D_\nu$ (because $C$ is a subalgebra of $D_\nu$), and the identity automorphism of $B_\nu$
maps $B_\nu$ isomorphically into $A_\nu$. By the amalgamation property this implies
that there exist $D_\lambda \in K$, an isomorphism $g_\nu$ of $A_\nu$ into $D_\lambda$, and an isomorphism
$k_\nu$ of $D_\nu$ into $D_\lambda$ such that $g_\nu(x) = k_\nu f_\nu(x)$ for all $x \in B_\nu$. We may
assume that $D_\lambda$ is an extension of $D_\nu$, and that $k_\nu$ is the identity automorphism
of $D_\nu$, so that $g_\nu(x) = f_\nu(x)$ for all $x \in B_\nu$. Thus $D_\xi$ has been selected
for all $\xi < \lambda + 1$ and $g_\xi$ has been selected for all $\xi$ with $\xi + 1 < \lambda + 1$, and
the conditions (1) and (2) hold whenever $\xi + 1 < \lambda + 1$. An easy induction
now establishes the existence of all the required algebras $D_\xi$ and functions $g_\xi$.

Each of the algebras $D_{\xi+1}$ with $\xi < \alpha$ is a subalgebra of $D_\alpha$, and therefore
$g_\xi$ maps $A_\xi$ isomorphically into $D_\alpha$. Since $A$ is a free $K$-product of $A_\xi$, $\xi < \alpha$,
it follows that there exists a homomorphism $g$ of $A$ into $D_\alpha$ such that $g(y) = g_\xi(y)$
whenever $\xi < \alpha$ and $y \in A_\xi$. In particular, if $x \in C_\xi$, then $y = f_\xi^{-1}(x) = h(x)$
belongs to $B_\xi$, and therefore $g(y) = g_\xi(y) = f_\xi(y) = x$. Thus $gh(x) = x$
whenever $x$ belongs to one of the algebras $C_\xi$, whence it follows that $gh(x) = x$
for all $x \in C$. This shows that $h$ is one-to-one, and the proof is complete.


COROLLARY 1.4. Suppose the class K of algebraic systems is non-trivial and
is equational in the wider sense, and assume that K has the embedding property
and the amalgamation property. If A is a free K-product of $A_i$, $i \in I$, and if for
each $i \in I$, $A_i$ is isomorphic to a subalgebra of a free K-algebra with $m_i$ generators,
then A is isomorphic to a subalgebra of a free K-algebra with

$$\sum_{i \in I} m_i$$

generators.


Proof. Let

$$m = \sum_{i \in I} m_i,$$

and let F be a free K-algebra generated by a set X with m elements. Then

$$X = \bigcup_{i \in I} X_i$$

where the sets $X_i$ are pairwise disjoint and $X_i$ has $m_i$ elements. If for each
$i \in I$, $F_i$ is the subalgebra of F that is generated by $X_i$, then F is a free
K-product of $F_i$, $i \in I$. Furthermore, $F_i$ is a free K-algebra generated by $X_i$,
whence it follows that $A_i$ is isomorphic to a subalgebra $B_i$ of $F_i$. If B is the
subalgebra of F that is generated by the set

$$\bigcup_{i \in I} B_i$$

then we infer by (1.3) that B is a free K-product of $B_i$, $i \in I$. Consequently
A is isomorphic to B.

An example shows that the amalgamation property cannot be dropped
from the hypothesis of (1.3) and (1.4). Let K be the class of all systems
consisting of a group G together with a homomorphism $\alpha$ of G into its centre.
That is, in addition to the group axioms the systems in K satisfy the conditions

$$\alpha(xy) = \alpha(x)\alpha(y), \qquad \alpha(x)y = y\alpha(x).$$

Regarded as a group, a free K-system F generated by a set X is a direct
product of a free group $F_0$ generated by X and a free Abelian group $F_1$ generated
by the set of all elements of the form $\alpha^k(x)$ with $x \in X$ and k = 1, 2, 3, ....
Since two elements of a free group commute if and only if they are powers
of the same element, it follows that two elements a and b of F commute if
and only if $a = u^p v$ and $b = u^q w$ where $u \in F_0$, $v, w \in F_1$, and p and q are
integers. From this we infer that every free K-system F has the following
property: If $a, b \in F$, and ab = ba, then there exist integers p and q, not
both zero, such that $a^q b^{-p}$ is in the centre of F. It obviously follows that
every subsystem of a free system also has this property.

Let G be a free K-product of $G_0$ and $G_1$, where $G_0$ is a free K-system generated
by a one-element set {x} and $G_1$ is a free Abelian group generated by
an infinite set $\{y_0, z_0, y_1, z_1, \ldots\}$ together with the endomorphism $\alpha$ that
takes $y_i$ into $y_{i+1}$ and $z_i$ into $z_{i+1}$. Letting $A_0$, $B_0$, $A_1$, and $B_1$ be the
subgroups of G generated by the sets $\{x\}$, $\{\alpha^k(x) \mid k = 1, 2, \ldots\}$, $\{y_0, z_0\}$, and
$\{y_1, z_1, y_2, z_2, \ldots\}$, respectively, and using $*$ and $\times$ to denote free
group-products and direct products, we find that $G_0 = A_0 \times B_0$ and $G_1 = A_1 \times B_1$,
and therefore $G = (A_0 * A_1) \times B_0 \times B_1$. It follows that no element of $A_1$
except the identity belongs to the centre of G, because no other element
commutes with x. Consequently $y_0^q z_0^{-p}$ does not belong to the centre of G
unless p = q = 0, and since $y_0 z_0 = z_0 y_0$ this shows that G is not isomorphic
to a subsystem of a free K-system. Inasmuch as $G_0$ is a free K-system and $G_1$
is isomorphic to the centre of a free K-system, it follows that the conclusion
of (1.4) fails for the class K.

Observing that in the proof of (1.4) the only use made of the amalgamation
property was through (1.3), we infer that the above class K must also violate
the conclusion of (1.3). An even simpler example of a non-trivial equational
class, having the embedding property but not satisfying the conclusion of
(1.3), is the class K of all groups G such that $a^2 b^2 = b^2 a^2$ for all $a, b \in G$.
Since there exist non-Abelian groups having this property (for example, the
group of all permutations of a three-element set), a free K-algebra with two
generators or, equivalently, a free K-product of two infinite cyclic groups,
cannot be Abelian. If A is a free K-algebra generated by a two-element set
$\{a_0, a_1\}$, and therefore the free K-product of the infinite cyclic groups $A_0$ and
$A_1$ generated by the sets $\{a_0\}$ and $\{a_1\}$, then the subgroup B of A generated
by $\{a_0^2, a_1^2\}$ is Abelian, and is therefore not a free K-product of the subgroups
generated by $\{a_0^2\}$ and $\{a_1^2\}$.
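The claim that the group of all permutations of a three-element set satisfies the identity $a^2 b^2 = b^2 a^2$ without being Abelian can be checked by brute force. The following is a minimal sketch; the helper names `compose` and `square` are illustrative choices and do not come from the paper.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' on {0, 1, 2}; both are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def square(p):
    return compose(p, p)

# All six permutations of a three-element set.
S3 = list(permutations(range(3)))

# The defining identity of the class K: squares commute.
satisfies_identity = all(
    compose(square(a), square(b)) == compose(square(b), square(a))
    for a in S3 for b in S3
)

# The group itself is not commutative.
is_abelian = all(compose(a, b) == compose(b, a) for a in S3 for b in S3)

print(satisfies_identity, is_abelian)
```

The check succeeds because every square in this group is either the identity or a 3-cycle, and the 3-cycles together with the identity form a commutative subgroup.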


2. Sublattices of a free lattice. We begin by applying the results of 
the preceding section to lattices. The class K of all lattices is non-trivial and 
equational. The direct product of two lattices is therefore a lattice, and since 
every lattice has a one-element sublattice it follows that K has the embedding 
property. In Jónsson (3), in the proof of Theorem 3.5, it is shown that K
has an amalgamation property that is stronger than the one considered 
here. Using (1.4) we therefore obtain: 


THEOREM 2.1. If A is a free lattice-product of $A_i$, $i \in I$, and if for each $i \in I$,
$A_i$ is isomorphic to a sublattice of a free lattice with $m_i$ generators, then A is
isomorphic to a sublattice of a free lattice with

$$\sum_{i \in I} m_i$$

generators.


COROLLARY 2.2. If A is a free lattice-product of $A_i$, $i \in I$, and if each $A_i$
is a denumerable chain, then A is isomorphic to a sublattice of a free lattice
with m generators, where m is the cardinal of I in case I is non-denumerable,
and m = 3 in case I is denumerable.


LEMMA 2.3. Suppose m is an infinite cardinal and F is a free lattice with m
generators. If $a, b \in F$ and $b < a$, then the lattice quotient a/b contains, as a
sublattice, a free lattice with m generators.
















Proof. If X is the set of generators of F, then there exists a finite subset Y
of X such that a and b belong to the sublattice $F'$ of F which is generated
by Y. The sublattice $F''$ of F, which is generated by the set $Z = X - Y$,
can be mapped homomorphically into a/b by a function f such that
$f(x) = b + ax$ whenever $x \in Z$. We shall show that f is an isomorphism.

Since F is a free lattice-product of $F'$ and $F''$, it follows from 1.3 that the
lattice D generated by a, b, and Z is a free lattice-product of $F''$ and of the
two-element lattice $E = \{a, b\}$. Letting $\bar{F}$ be a lattice obtained by adjoining
a zero element 0 and a unit element 1 to $F''$ we map E and $F''$ into $\bar{F}$ by
mapping a into 1, b into 0, and each element of $F''$ into itself. These
isomorphisms have a common extension g which is a homomorphism of D into
$\bar{F}$. Since f is a homomorphism of $F''$ into D, gf is a homomorphism of $F''$ into
$\bar{F}$. Furthermore, $gf(x) = g(b + ax) = 0 + 1x = x$ for all $x \in Z$, and therefore
$gf(x) = x$ for all $x \in F''$. Consequently f is an isomorphism of $F''$ into
a/b, and the proof is complete.

Given two non-empty subsets B and C of a partially ordered set A, we shall
write B < C if and only if either B = C or else $b < c$ for all $b \in B$ and
$c \in C$. It is obvious that the non-empty subsets of a partially ordered set
form, under this relation, another partially ordered set.


THEOREM 2.4. If $m \ge 3$, and if the lattice A is the union of a denumerable
chain $\mathcal{A}$ of sublattices each of which is isomorphic to a sublattice of a free lattice
with m generators, then A is isomorphic to a sublattice of a free lattice with m
generators.


Proof. Since a free lattice with three generators contains as a sublattice 
a free lattice with infinitely many generators, we may assume that m is 
infinite. 

Let F be a free lattice with m generators and let $\mathcal{Q}$ be the family of all
quotients a/b with $a, b \in F$ and $b < a$. Then $\mathcal{Q}$ is a partially ordered set.
Furthermore, for any two quotients a/b and c/d in $\mathcal{Q}$, if a/b < c/d, that
is, if $b < a < d < c$, then it follows by 2.3 that there exist $x, y \in F$ such
that $a < y < x < d$ and therefore

$$a/b < x/y < c/d.$$

Consequently, if $\mathcal{C}$ is a maximal chain in $\mathcal{Q}$, then $\mathcal{C}$ is dense-in-itself, and
$\mathcal{A}$ is therefore order-isomorphic to a subchain $\mathcal{C}'$ of $\mathcal{C}$. By (2.3) and the
hypothesis, each of the lattices $B \in \mathcal{A}$ is isomorphic to a sublattice $B'$ of
the corresponding quotient in $\mathcal{C}'$, and we conclude that the union $A'$ of these
lattices $B'$ is a sublattice of F, and that A is isomorphic to $A'$.


THEOREM 2.5. Suppose A is a lattice with a zero element 0 and a unit element
1, and assume that B and C are sublattices of A such that

$$B \cap C = \emptyset \quad \text{and} \quad B \cup C = A - \{0, 1\},$$
$$b + c = 1 \quad \text{and} \quad bc = 0 \quad \text{whenever } b \in B \text{ and } c \in C.$$

If B and C are isomorphic to sublattices of a free lattice with m generators, where
$m \ge 3$, then so is A.


Proof. We may assume that m is infinite. If F is a free lattice with m
generators, then F is not modular, and hence there exist $a, b, c \in F$ such that

$$(1) \qquad ac < b < a < b + c.$$

We can further assume that a is additively irreducible. In fact, by (2.3) the
lattice quotient a/b contains as a sublattice a free lattice $F'$ with m generators,
and $F'$ contains an element $a'$ that is multiplicatively reducible (in $F'$ and
therefore also in F), and consequently $a'$ is additively irreducible. Furthermore,
$b < a' < a$ and hence $a'c \le ac < b$. Thus (1) holds with a replaced
by $a'$. We henceforth assume that a is additively irreducible.

Again using (2.3), we select an additively irreducible element $d \in F$ with
$c < d < b + c$, and we show that

$$(2) \qquad ad < b + ad < a < b + c, \qquad ad < c + ad < d < b + c.$$

Since $c < d < b + c$, it follows that $b \not\le d$ and therefore $ad < b + ad$. Also
$b + ad \le a$, and an equality would imply that $a = ad$ (because a is additively
irreducible and $b < a$). From this we could infer that $a \le d$, hence
$b + c \le a + c \le d$, contrary to our choice of d. Thus $b + ad < a$. The inequality
$a < b + c$ is part of our hypothesis (1). Since $b < a < b + c$, we have $c \not\le a$,
hence $ad < c + ad$. Furthermore $c + ad \le d$, and equality is excluded
because it would imply that $d = ad$, hence $c < d \le a$. The last inequality
in (2) holds because of our choice of d.

By (2.3) the lattice quotients $a/(b + ad)$ and $d/(c + ad)$ contain free
lattices $F_1$ and $F_2$ with m generators, and by hypothesis it follows that there
exist functions f and g mapping B and C isomorphically into $F_1$ and $F_2$,
respectively. Observing that $x + y = b + c$ and $xy = ad$ whenever $x \in F_1$
and $y \in F_2$, we obtain the desired isomorphism h of A into F by letting
$h(x) = f(x)$ for all $x \in B$, $h(x) = g(x)$ for all $x \in C$, $h(0) = ad$, and
$h(1) = b + c$.

In proving our last result we need the observation that a free lattice, and 
hence every sublattice of a free lattice, satisfies a special case of the distri- 
butive law. 


LEMMA 2.6. Suppose F is a free lattice and $u, a, b, c \in F$.
(i) If $u = ab = ac$, then $u = a(b + c)$.
(ii) If $u = a + b = a + c$, then $u = a + bc$.


Proof. By Whitman (4, Theorem 2, Corollary 2), the canonical representation
of u,

$$u = \prod_i u_i,$$

has the property that if

$$u = \prod_j v_j,$$

then each of the elements $u_i$ contains one of the elements $v_j$. Under the
hypothesis of (i) it follows that each of the elements $u_i$ either contains a or
else contains both b and c, and in either case we therefore have $a(b + c) \le u_i$.
Consequently $a(b + c) \le u$. The opposite inclusion is obvious, and (ii) follows
by duality.
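Condition (i) of (2.6) is easy to test on small finite lattices, which gives a quick way to see that some lattices cannot occur as sublattices of a free lattice. The sketch below is only an illustration: the encoding of a lattice by its covering pairs and all function names are assumptions made here, not notation from the paper. It checks the five-element modular lattice with three atoms (called `M3` below) and the five-element non-modular lattice (`N5`).

```python
from itertools import product

def order_closure(elements, covers):
    """Reflexive-transitive closure of pairs (x, y), each read as x <= y."""
    leq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(leq), repeat=2):
            if y == y2 and (x, z) not in leq:
                leq.add((x, z))
                changed = True
    return leq

def meet(elements, leq, x, y):
    """Greatest lower bound (the product xy in the paper's notation)."""
    lower = [z for z in elements if (z, x) in leq and (z, y) in leq]
    return next(m for m in lower if all((z, m) in leq for z in lower))

def join(elements, leq, x, y):
    """Least upper bound (the sum x + y in the paper's notation)."""
    upper = [z for z in elements if (x, z) in leq and (y, z) in leq]
    return next(m for m in upper if all((m, z) in leq for z in upper))

def satisfies_condition_i(elements, covers):
    """Check: ab = ac implies a(b + c) = ab, for all a, b, c."""
    leq = order_closure(elements, covers)
    for a, b, c in product(elements, repeat=3):
        u = meet(elements, leq, a, b)
        if u == meet(elements, leq, a, c):
            if meet(elements, leq, a, join(elements, leq, b, c)) != u:
                return False
    return True

# Three pairwise incomparable atoms p, q, r between 0 and 1.
M3 = (["0", "p", "q", "r", "1"],
      [("0", "p"), ("0", "q"), ("0", "r"), ("p", "1"), ("q", "1"), ("r", "1")])

# A chain 0 < a < b < 1 together with an element c incomparable to a and b.
N5 = (["0", "a", "b", "c", "1"],
      [("0", "a"), ("a", "b"), ("b", "1"), ("0", "c"), ("c", "1")])

print(satisfies_condition_i(*M3), satisfies_condition_i(*N5))
```

In `M3` the atoms give pq = pr = 0 while p(q + r) = p, so the condition fails; in `N5` a direct case check shows that it holds.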


THEOREM 2.7. Every finite dimensional sublattice of a free lattice is finite. 


Proof. We shall actually prove the stronger statement that every finite 
dimensional lattice A which satisfies the condition (i) of (2.6) is finite. Assuming 
that this holds for all lower dimensional cases, consider the case when the 
dimension of A is n. 

Let M be the set of all the atoms of A, choose $a \in M$, and let $N = M - \{a\}$.
Then $ab = 0$ for all $b \in N$, and letting

$$c = \sum_{b \in N} b$$

we infer from (2.6i) together with the finiteness of the dimension of A
that $ac = 0$. Therefore $c \ne 1$, and by the inductive hypothesis the quotient
c/0 must be finite. In particular this shows that N is finite, and therefore M
is finite. Since, by the inductive hypothesis, all the quotients 1/b with $b \in M$
are finite, and since every member of A except the element 0 belongs to at
least one of these quotients, we conclude that A must be finite.


Analysing the proof of the last theorem we can actually find an upper
bound for the number of elements in an n dimensional sublattice A of a free
lattice. We first prove by induction that A has at most n atoms. We simply
observe that, in the notation used above, the atoms of the lattice quotient
c/0 are precisely the elements of N, and infer by the inductive hypothesis
that N has at most $n - 1$ elements. A second induction proves that A has
at most $2 \cdot (n!)$ elements. For each element of A, except the element 0, is
contained in one of the quotients 1/b with $b \in M$, and the fact that 0 belongs
to none of these quotients is more than made up for since 1 belongs to all
the quotients. Actually these estimates can be considerably improved. For
instance, if n = 3, then A has at most 8 elements, and if n > 3, then A has
at most $n - 1$ atoms.
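The second induction can be unwound numerically. The sketch below assumes, as a reading of the argument above, that a one-dimensional lattice has two elements and that an n dimensional lattice is covered by at most n quotients 1/b, each of dimension at most n - 1; the function name `size_bound` is hypothetical.

```python
from math import factorial

def size_bound(n):
    """Bound from the induction: at most n atoms, each quotient 1/b
    having dimension at most n - 1."""
    if n == 1:
        return 2  # assumed base case: the two-element chain
    return n * size_bound(n - 1)

# The recursion reproduces the closed form 2 * n! stated in the text.
for n in range(1, 8):
    assert size_bound(n) == 2 * factorial(n)
```

For n = 3 this gives 12, which is consistent with the remark that the estimate is not sharp, since the text gives 8 as the true maximum in that case.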

Finally, Professor R. Dilworth has observed that by a slight modification 
of our proof it can be shown that if a sublattice A of a free lattice satisfies 
the double chain condition, then A is finite. The set of all lattice quotients 
a/b of A, ordered by set-inclusion, satisfies the minimal condition, and one 
need therefore only consider the case in which A has the additional property 
that every quotient properly contained in A is finite. Under this assumption 
the finiteness of A follows as in the proof of (2.7). 











REFERENCES


1. R. A. Dean, Sublattices of free lattices.
2. R. Fraïssé, Sur l'extension aux relations de quelques propriétés des ordres, Ann. Sci. École Norm. Sup. (3), 71 (1954), 363-388.
3. B. Jónsson, Universal relational systems, Math. Scand., 4 (1956), 193-208.
4. P. M. Whitman, Free lattices I, Ann. Math. (2), 42 (1941), 325-330.
5. P. M. Whitman, Free lattices II, Ann. Math. (2), 43 (1942), 104-115.


University of Minnesota





DISTRIBUTIVE SUBLATTICES OF A FREE LATTICE 
FRED GALVIN AND BJARNI JONSSON


The purpose of this note is to characterize those distributive lattices that 
can be isomorphically embedded in free lattices. It is known (cf. (2)) that
in a free lattice every element is either additively or multiplicatively irredu- 
cible, and consequently every sublattice of a free lattice must also have this 
property. We therefore begin by studying the class of all those distributive 
lattices in which this condition is satisfied. 

The notion of a linearly indecomposable lattice will play a fundamental
role in these investigations. Given two non-empty subsets B and C of a
partially ordered set A, we write B < C if and only if either B = C or else
$b < c$ whenever $b \in B$ and $c \in C$. It is obvious that under this relation the
non-empty subsets of A form a partially ordered set. A lattice A is said to
be linearly indecomposable if there do not exist sublattices B and C of A
such that $A = B \cup C$ and B < C. Clearly every lattice A is the union of a
unique linearly ordered family $\mathcal{C}$ of linearly indecomposable lattices.
Furthermore, A is distributive if and only if each member of $\mathcal{C}$ is distributive,
and in order for A to have the property that each of its elements is either
additively or multiplicatively irreducible it is necessary and sufficient that
each member of $\mathcal{C}$ have this property. We therefore need only consider the
case of a linearly indecomposable lattice.


LEMMA 1. Suppose D is a distributive lattice with the property that every
element of D is either additively or multiplicatively irreducible. If the elements
$x_1, x_2, x_3 \in D$ are such that no two of them are comparable, then they generate
an eight-element Boolean algebra.


Proof. Since the element

$$(x_2 + x_3)(x_3 + x_1)(x_1 + x_2) = x_2 x_3 + x_3 x_1 + x_1 x_2$$

cannot be both additively and multiplicatively reducible, either one of the
factors on the left must be contained in the other two factors, or else one
of the summands on the right must contain the other two summands. By
symmetry and duality we may assume that $x_2 x_3$ and $x_3 x_1$ are contained in
$x_1 x_2$, so that

$$(1) \qquad x_2 x_3 = x_3 x_1 \le x_1 x_2.$$


Received December 15, 1959. The results presented here were obtained while the first author 
was an NSF Fellow, and the work of the second author was supported by a research grant 
from the NSF. 


















Considering the element

$$(x_1 + x_3)(x_2 + x_3) = x_1 x_2 + x_3,$$

we see that one of the following four inclusions must hold:

$$(2) \qquad x_1 + x_3 \le x_2 + x_3, \quad x_2 + x_3 \le x_1 + x_3, \quad x_1 x_2 \ge x_3, \quad x_3 \ge x_1 x_2.$$

If the first inclusion holds, then $x_1 = x_1 x_2 + x_1 x_3$, and it follows by (1) that
$x_1 = x_1 x_2 \le x_2$, contrary to our hypothesis that $x_1, x_2, x_3$ be incomparable.
Similarly the second inclusion in (2) leads to a contradiction, and obviously
so does the third. Finally, if $x_3 \ge x_1 x_2$, then the inclusion in (1) can be
replaced by an equality, and it follows that $x_1$, $x_2$, and $x_3$ are the atoms of a
Boolean algebra with eight elements.
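The proof of Lemma 1 starts from the identity $(x_2 + x_3)(x_3 + x_1)(x_1 + x_2) = x_2 x_3 + x_3 x_1 + x_1 x_2$, which holds in every distributive lattice. As a sanity check only, the sketch below verifies it in one concrete distributive lattice, the subsets of a three-element set encoded as bitmasks (an encoding chosen here for convenience, with `|` as sum and `&` as product).

```python
from itertools import product

# Bitmasks 0..7 stand for the eight subsets of a three-element set.
subsets = range(8)

def median_identity_holds(x1, x2, x3):
    """Compare the product of pairwise sums with the sum of pairwise products."""
    left = (x2 | x3) & (x3 | x1) & (x1 | x2)
    right = (x2 & x3) | (x3 & x1) | (x1 & x2)
    return left == right

identity_holds_everywhere = all(
    median_identity_holds(*triple) for triple in product(subsets, repeat=3)
)
print(identity_holds_everywhere)
```

A check in one lattice is of course not a proof of the general identity; it only illustrates the equation on which the argument turns.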


LEMMA 2. Suppose D is a linearly indecomposable distributive lattice with the 
property that every element of D is either additively or multiplicatively irreducible. 
Then the width of D is at most 3. Furthermore, if the width of D is 3, then D is 
a Boolean algebra with eight elements. 


Proof. By Lemma 1, if the width of D is 3 or more, then D contains as a
sublattice a Boolean algebra B with eight elements. Let z and u be the zero
and the unit of B. We shall show that if d is an element of D which does
not belong to B, then either $d < z$ or $d > u$.

First observe that if p is an atom in B, then there exists no element $d \in D$
such that $z < d < p$. In fact, if such an element d exists, and if q and r are
the other two atoms of B, then the element

$$q + d = (q + p)(q + d + r)$$

is both additively and multiplicatively reducible.

Now consider any element d of D and let

$$p' = z + pd, \quad q' = z + qd, \quad \text{and} \quad r' = z + rd$$

where p, q, and r are the atoms of B. Then $z \le p' \le p$, hence $p' = z$ or
$p' = p$. Since p is multiplicatively reducible (in B and therefore also in D),
it must be additively irreducible. It follows that if $p' = p$, then $p = pd \le d$.
Similarly, either $q' = z$ or else $q \le d$, and either $r' = z$ or $r \le d$. By symmetry
we need only consider four out of the eight cases that may arise.

If $p' = q' = r' = z$, then $u(z + d) = z$. Hence $d \le z$, for otherwise the
element

$$p + d = (p + d + q)(p + d + r)$$

would be both additively and multiplicatively reducible.

If $p \le d$ and $q' = r' = z$, then $(q + d)u = q + du = q + p$. Since $q + p$
is multiplicatively irreducible it follows that

$$q + p = q + d, \qquad d = d(q + d) = d(q + p) = dq + dp = p.$$

If $p \le d$, $q \le d$, and $r' = z$, then $ud = p + q$, hence $d = p + q$.

If $p \le d$, $q \le d$, and $r \le d$, then $u \le d$.

Thus we see that if d is not an element of B, then either $d < z$ or else
$d > u$.

The element z is multiplicatively reducible and must therefore be additively
irreducible, whence it follows that if there exist elements $d \in D$ with $d < z$,
then the set A consisting of all these elements must be a sublattice of D. The
set $C = D - A$ is precisely the set of all elements $d \in D$ with $z \le d$, and
we therefore have A < C. We therefore see that if A were non-empty, then
D would not be linearly indecomposable as required by the hypothesis.
Similarly, the assumption that there exists $d \in D$ with $u < d$ leads to a
contradiction, and we conclude that D = B.


LEMMA 3. Suppose D is a linearly indecomposable distributive lattice with 
the property that every element of D is either additively or multiplicatively irre- 
ducible. If the width of D is 2, then D is isomorphic to a direct product of two 
chains, one of which has exactly two elements. 


Proof. We consider two cases depending on whether D does or does not 
have a zero element. In each case the proof will be divided into several parts. 


Case I. D has a zero element z.


Statement Ia. There exists an atom p of D which is multiplicatively
irreducible.


Proof. The zero element z must be multiplicatively reducible, for otherwise
the set $D - \{z\}$ would be a sublattice of D, and D would not be linearly
indecomposable. Thus there exist $p, q \in D$ such that $z = pq$, $z < p$, and
$z < q$. If neither p nor q were an atom, then there would exist $x, y \in D$ such
that $z < x < p$ and $z < y < q$, and the elements p, q, and $x + y$ would be
incomparable, which is impossible because the width of D is only 2. We may
therefore assume that p is an atom.

If p is multiplicatively reducible, $p = ab$ with $p < a$ and $p < b$, then two
of the three elements a, b, and q must be comparable. Since ab is properly
contained in a and in b, a and b cannot be comparable, and since p is not
contained in q, neither a nor b can be contained in q. Therefore either a or b
must contain q, and we can assume that $q \le a$.

For any $x \in D$ with $z < x \le q$ we have $x + p = a(x + b)$. Now $x < x + p$
and $p < x + p$. Also, the equality $x + p = x + b$ is excluded because it would
imply that $b \le x + b = x + p \le a$. We must therefore have $x + p = a$,
$q \le x + p$, $q = qx + qp = x + z = x$. Thus q is an atom of D.

If q is also multiplicatively reducible, $q = cd$ with $q < c$ and $q < d$, then
p is contained in either c or d, say $p \le c$. Observe that b does not contain q,
and therefore contains neither d nor $p + q$. Similarly, d contains neither b
nor $p + q$. Furthermore, $b(p + q) = p < b$ and $d(p + q) = q < d$, so that
$p + q$ contains neither b nor d. Consequently b, d, and $p + q$ are incomparable.
This contradicts our hypothesis, and we conclude that either p or q
must be a multiplicatively irreducible atom.


Statement Ib. If p is a multiplicatively irreducible atom of D, then the set

$$C = \{x \mid x \in D \text{ and } px = z\}$$

is a chain and an ideal of D, and D is the inner direct product of C and of
the two-element chain $C' = \{z, p\}$.


Proof. Clearly C is an ideal of D, and if $x, y \in C$, then either $x \le y$ or
$y \le x$, because otherwise the three elements x, y, and p would be incomparable.
Since C and $C'$ are ideals of D and have only the zero element in
common, their inner direct product $A = C' \times C$ exists and is an ideal of
D. The proof will be completed by showing that if the set $B = D - A$ were
non-empty, then B would be a sublattice of D and A < B.

Given $x \in B$ we have $x \notin C$, whence $px \ne z$, and thus $p \le x$. For all $y \in C$
we have $x(y + p) = xy + p$, whence it follows that $p \le xy$ or $xy \le p$ or
$x \le y + p$ or $y + p \le x$. The first case is excluded because $py = z < p$,
and the third case is ruled out because it would yield $x = xy + p \in A$. The
case $xy \le p$ yields $xy = z$, $x(y + p) = p$, and since p is multiplicatively
irreducible it follows that $y + p = p$, $y \le p$, $y = z$, $y + p = p \le x$. Finally,
in the last case the equality $y + p = x$ is ruled out since $y + p \in A$. Thus
$p + y < x$ whenever $x \in B$ and $y \in C$, whence it follows that A < B.

Clearly, if $x_1 \in B$ and $x_1 \le x_2$, then $x_2 \in B$. To show that B is a sublattice
of D it is therefore sufficient to show that if $x_1, x_2 \in B$, then $x_1 x_2 \in B$. If
this fails, then $x_1 x_2 \in A$. Since every member of B contains p, we have
$p \le x_1 x_2$, and therefore $x_1 x_2 = p + y$ for some $y \in C$. But since $x_1 x_2$ is
multiplicatively reducible, and is therefore additively irreducible, it follows that
$y = z$, $p = x_1 x_2$. However, this is excluded because p is multiplicatively
irreducible.


The next statement will be needed in the treatment of Case II below. 


Statement Ic. The set C in Ib consists of all the additively irreducible 
elements of D, except the element p. 

Proof. Since C is a chain, every element of C is additively irreducible in C,
and since C is an ideal of D, it follows that every element of C is additively
irreducible in D. On the other hand, if $a \in D$, $a \notin C$, and $a \ne p$, then $a = p + y$
for some $y \in C$, and therefore a is additively reducible.


Case II. D does not have a zero element.


Statement Ila. If z € D is multiplicatively reducible, then the dual ideal 
generated by z is linearly indecomposable. 


Proof. There exist $a, b \in D$ such that $z = ab$, $z < a$, and $z < b$. Let $D_z$ be
the dual ideal generated by z, and suppose there exist sublattices A and B
of $D_z$ such that $D_z = A \cup B$ and A < B. Clearly $a, b \in A$. If $x \in D$ and
$x \notin D_z$, then $x < a$ or $x < b$, for otherwise the elements a, b, and x would
be incomparable. It readily follows that if $A'$ is the set of all those elements
$x \in D$ which are contained in some member of A, then $D = A' \cup B$ and
$A' < B$, contrary to our hypothesis.


Statement Ilb. Every element of D contains a multiplicatively reducible 
element. 


Proof. If $x \in D$ is not itself multiplicatively reducible, then either x is the
largest element of D, or else there exists $y \in D$ such that x and y are
incomparable, for otherwise D would be the union of the two sublattices

$$A = \{y \mid x \ge y \in D\} \quad \text{and} \quad B = \{y \mid x < y \in D\}$$

with A < B. Thus $xy < x$, and xy is multiplicatively reducible.


Statement IIc. The set A consisting of all the additively irreducible elements
of D is a chain, and every member of A is covered by a unique member of
$D - A$.


Proof. Suppose $a, b \in A$. Since D does not have a zero element, there
exist $x, y \in D$ such that $x < y < ab$, and by IIb there exists a multiplicatively
reducible element z with $z \le x$. Let $D_z$ be the dual ideal generated by z. In
view of IIa we can apply Ia, b, c with D replaced by $D_z$. Let p and C be
as in Ib. Then $a \ne p \ne b$ because a and b do not cover z, and it follows by
Ic that $a, b \in C$. Since C is a chain, we conclude that $a \le b$ or $b \le a$. Thus
A is a chain. Finally, by Ib, a is covered by $p + a$ and by no other additively
reducible element.


Statement IId. Let A be the set consisting of all the additively irreducible
elements of D, and for each $a \in A$ let $a'$ be the unique member of $D - A$
that covers a. Then the mapping $(0, a) \to a$, $(1, a) \to a'$ is an isomorphism
of the outer direct product $\{0, 1\} \times A$ onto D.


Proof. For each multiplicatively reducible element z of D let $D_z$ be the
dual ideal generated by z and let $A_z = A \cap D_z$. In view of IIa we may apply
Ia, b, c with D replaced by $D_z$. Observe that $p = z'$ satisfies the hypothesis
of Ib, and denote by $C_z$ the corresponding set C defined in Ib. Clearly $A_z \subseteq C_z$.

If $z_0$ and $z_1$ are multiplicatively reducible elements of D with $z_0 < z_1$,
then we see by IIb that $z_1' = z_0' + z_1$ and $z_0 = z_0' z_1$, and hence that $C_{z_1} \subseteq C_{z_0}$.
Now suppose z is multiplicatively reducible, $a \in D_z$, and $a \notin A_z$. Then there
exist $b, c \in D$ such that $a = b + c$, $b < a$, and $c < a$. We can then find a
multiplicatively reducible element $z_0$ with $z_0 \le bc$. Then a is additively
reducible in $D_{z_0}$, so that $a \notin C_{z_0}$. Consequently $a \notin C_z$. Thus we see that
$C_z \subseteq A_z$, hence $A_z = C_z$.














By Ib, the mapping $(0, a) \to a$, $(1, a) \to a' = z' + a$ is an isomorphism of
$\{0, 1\} \times C_z$ onto $D_z$. The lattices $\{0, 1\} \times C_z$ form a chain whose union is
$\{0, 1\} \times A$, and the lattices $D_z$ form a chain whose union is D. Consequently
the indicated mapping is an isomorphism of $\{0, 1\} \times A$ onto D.


THEOREM 4. For any distributive lattice D the following conditions are equiva- 
lent: 

(i) Every element of D is either additively or multiplicatively irreducible. 

(ii) D is the union of a linearly ordered family $\mathcal{C}$ of sublattices such that
each member of $\mathcal{C}$ is either a one-element lattice or an eight-element Boolean
algebra, or else is isomorphic to a direct product of two chains, one of which
consists of exactly two elements.


Proof. As we observed in the introduction, D is the union of a simply ordered 
family of linearly indecomposable sublattices. That (i) implies (ii) therefore 
follows from Lemmas 2 and 3, together with the obvious observation that a 
lattice of width 1 (a chain) is linearly indecomposable if and only if it con- 
sists of just one element. 

Conversely, it is easy to show that under the hypothesis of (ii) each member
C of $\mathcal{C}$ has the property that every element of C is either additively or
multiplicatively irreducible, whence it follows that D also has this property.
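The converse direction, described above as easy to show, can be spot-checked on finite instances of the lattices named in (ii). The sketch below is an illustration under assumptions made here: products of finite chains are encoded as tuples of integers with componentwise order, and the function names are hypothetical. It tests a two-element chain times a five-element chain, the eight-element Boolean algebra, and, as a contrasting case, the product of two three-element chains.

```python
from itertools import product

def grid(*sizes):
    """Direct product of finite chains of the given sizes, as tuples."""
    return list(product(*(range(n) for n in sizes)))

def meet(x, y):
    return tuple(map(min, x, y))

def join(x, y):
    return tuple(map(max, x, y))

def additively_reducible(e, elems):
    """e is a sum of two elements both different from e."""
    return any(join(x, y) == e
               for x in elems for y in elems if x != e and y != e)

def multiplicatively_reducible(e, elems):
    """e is a product of two elements both different from e."""
    return any(meet(x, y) == e
               for x in elems for y in elems if x != e and y != e)

def condition_i(elems):
    """Every element is additively or multiplicatively irreducible."""
    return all(
        not (additively_reducible(e, elems)
             and multiplicatively_reducible(e, elems))
        for e in elems
    )

print(condition_i(grid(2, 5)), condition_i(grid(2, 2, 2)), condition_i(grid(3, 3)))
```

In the product of two three-element chains the middle element (1, 1) equals both (1, 0) + (0, 1) and the product of (2, 1) and (1, 2), which is why that lattice fails condition (i).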


LEMMA 5. Every simply ordered subset of a free lattice is denumerable.*


Proof. Let F be a free lattice generated by a set X. The alternative case
being trivial, we assume that X is non-denumerable. Let $X_0$ be a denumerably
infinite subset of X, and let $F_0$ be the sublattice of F generated by $X_0$.

For $a, b \in F$ write $a \equiv b$ if and only if there exists an automorphism f of
F such that $f(a) = b$. Clearly $\equiv$ is an equivalence relation over F. For each
$a \in F$ there exists a finite subset Y of X such that a belongs to the sublattice
of F generated by Y. We can find a permutation p of X which maps Y into
$X_0$, and p can be extended to an automorphism f of F. Consequently
$a \equiv f(a) \in F_0$. Thus every equivalence class modulo $\equiv$ contains a member of $F_0$.
The number of equivalence classes must therefore be denumerable, and the
proof will be completed if we show that no simply ordered subset of F
contains more than one element from any one equivalence class. That is, it
suffices to show that if $a \equiv b$ and $a \le b$, then $a = b$.

Suppose f is an automorphism of F such that $f(a) = b$. There exists a
finite subset Y of X such that a belongs to the sublattice of F generated
by Y. If Z is the image of Y under f, then there exists a permutation p of
X such that $p(x) = f(x)$ whenever $x \in Y$, and $p(x) = x$ whenever
$x \in X - (Y \cup Z)$. If g is the automorphism of F such that $g(x) = p(x)$ whenever
$x \in X$, then $g(a) = f(a) = b$, and g is of some finite order n. If now $a \le b$,
then

$$a \le g(a) \le g^2(a) \le \ldots \le g^n(a) = a,$$

hence $a \le b \le a$, hence $a = b$. This completes the proof.


*A somewhat more involved argument can be used to show that if F is a free lattice and
if Y is a subset of F with $\aleph_\alpha$ elements, where $\aleph_\alpha$ is a non-denumerable, regular cardinal, then
Y contains a subset Z with $\aleph_\alpha$ elements such that Z generates a free sublattice of F.


THEOREM 6. For any distributive lattice D the following conditions are equiva- 
lent: 

(i) D is isomorphic to a sublattice of a free lattice. 

(ii) D is isomorphic to a sublattice of a free lattice with three generators. 

(iii) D is denumerable, and every element of D is either additively or multi- 
plicatively irreducible. 

(iv) D is the union of a denumerable, linearly ordered family $\mathcal{C}$ of sublattices
where each member of $\mathcal{C}$ is either a one-element lattice or an eight-element
Boolean algebra, or else is isomorphic to a direct product of a two-element chain
and a denumerable chain.


Proof. Clearly (ii) implies (i) and, as we observed in the introduction, (i) 
implies that every element of D is either additively or multiplicatively irre- 
ducible. Using Theorem 4 and Lemma 5, we therefore see that (i) implies 
that D is denumerable. Thus (i) implies (iii). Since (iii) and (iv) are equivalent
by Theorem 4, it remains only to prove that (iv) implies (ii).

If F is a free lattice generated by x, y, and z, then it is easy to check that
the elements yz, zx, and xy generate an eight-element Boolean algebra. Also,
F contains as a sublattice a free lattice $F'$ with five generators $x_0, x_1, x_2, x_3, x_4$.
If C is a denumerable chain, then there exists an isomorphism f of C into the
sublattice generated by $x_2$, $x_3$, and $x_4$. Defining the mapping g of $A = \{0, 1\} \times C$
into $F'$ by the conditions

$$g(\langle 1, c \rangle) = x_0 + x_1 f(c), \qquad g(\langle 0, c \rangle) = (x_0 + x_1 f(c))x_1,$$

for all $c \in C$, we shall see that g is an isomorphism of A into $F'$.
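Let h be the endomorphism of $F'$ such that

$$h(x_0) = 0, \quad h(x_1) = 1, \quad \text{and} \quad h(x_i) = x_i \text{ for } i = 2, 3, 4.$$

Then

$$hg(\langle 1, c \rangle) = f(c) = hg(\langle 0, c \rangle)$$

for all $c \in C$. Consequently g is one-to-one on the set of elements of the form
$\langle 1, c \rangle$, and also on the set of elements of the form $\langle 0, c \rangle$. Furthermore, if
$c, c' \in C$, then $g(\langle 0, c \rangle) \le x_1$ and $g(\langle 1, c' \rangle) \not\le x_1$, so that $g(\langle 0, c \rangle) \ne g(\langle 1, c' \rangle)$.
Thus g is one-to-one.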


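If $c, c' \in C$ and $c \le c'$, then it is easy to check that

$$g(\langle 1, c \rangle) + g(\langle 0, c' \rangle) = g(\langle 1, c' \rangle),$$
$$g(\langle 1, c \rangle) g(\langle 0, c' \rangle) = g(\langle 0, c \rangle),$$

and since g is obviously order-preserving, it follows that g is an isomorphism.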














Thus we see that, under the hypothesis of (iv), every member of $\mathcal{C}$ is
isomorphic to a sublattice of a free lattice with three generators, and we
conclude by (1, Theorem 2.4) that (ii) holds. This completes the proof.
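The statement in the proof above that yz, zx, and xy generate an eight-element Boolean algebra in the free lattice on x, y, z can be checked mechanically with Whitman's recursive test for inclusion between lattice terms. The sketch below is an illustration only: the term encoding (generators as strings, `("j", s, t)` for the sum s + t, `("m", s, t)` for the product st) and the function names are assumptions made here, not notation from the paper.

```python
from functools import lru_cache
from itertools import combinations

@lru_cache(maxsize=None)
def leq(s, t):
    """Decide s <= t in a free lattice by Whitman's recursive conditions."""
    s_gen, t_gen = isinstance(s, str), isinstance(t, str)
    if s_gen and t_gen:
        return s == t
    if not s_gen and s[0] == "j":          # s1 + s2 <= t
        return leq(s[1], t) and leq(s[2], t)
    if not t_gen and t[0] == "m":          # s <= t1 t2
        return leq(s, t[1]) and leq(s, t[2])
    if s_gen:                              # generator <= t1 + t2
        return leq(s, t[1]) or leq(s, t[2])
    if t_gen:                              # s1 s2 <= generator
        return leq(s[1], t) or leq(s[2], t)
    # s1 s2 <= t1 + t2: Whitman's condition
    return (leq(s[1], t) or leq(s[2], t)
            or leq(s, t[1]) or leq(s, t[2]))

def generated_sublattice(generators):
    """Representatives of the sublattice generated by the given terms.
    Terminates only when that sublattice is finite."""
    reps = []

    def add(term):
        if any(leq(term, r) and leq(r, term) for r in reps):
            return False
        reps.append(term)
        return True

    for g in generators:
        add(g)
    changed = True
    while changed:
        changed = False
        for s, t in combinations(list(reps), 2):
            for op in ("j", "m"):
                if add((op, s, t)):
                    changed = True
    return reps

yz, zx, xy = ("m", "y", "z"), ("m", "z", "x"), ("m", "x", "y")
sublattice = generated_sublattice([yz, zx, xy])
print(len(sublattice))
```

The same `leq` test also shows that the free lattice itself is not distributive, since `leq(("m", "x", ("j", "y", "z")), ("j", xy, ("m", "x", "z")))` evaluates to `False`.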


REFERENCES 


1. B. Jónsson, Sublattices of a free lattice, Can. J. Math., 13 (1961), 256-264.
2. P. M. Whitman, Free lattices I, Ann. Math. (2), 42 (1941), 325-330. 


University of Minnesota 








INTEGRATION OF SUBSPACES DERIVED FROM 
A LINEAR TRANSFORMATION FIELD 


EDWARD T. KOBAYASHI 


1. Introduction. The problem we study is a generalization of a problem
first solved by Tonolo (6), then generalized successively by Schouten (5),
Nijenhuis (4), Haantjes (3), and Nijenhuis-Frölicher (2). The Tonolo-Schouten
approach is distinct from that of Nijenhuis-Haantjes-Frölicher in
the sense that the former consider the problem on a Riemannian space,
while the latter consider it on a manifold without any further structure.

The object of investigation is the integrability of the distribution $\theta$ of
vector subspaces $\theta_p$ of the tangent space $T_p$ to a manifold M, when $\theta_p$ is
intrinsically related to a given field h on M, of linear transformations $h_p$ on $T_p$.
The research has so far been restricted to certain types of h. The result, under
the weakest restriction, was that of Haantjes, which states that if h is of
"type A"* then all the distributions are integrable if and only if the following
condition is satisfied:

$hh[h,h](u,v) + [h,h](hu,hv) - h[h,h](hu,v) - h[h,h](u,hv) = 0$

where u, v are two vector fields over M, and [h, h] is a vector 2-form
introduced by Nijenhuis (cf. § 2).

We free ourselves from any restriction on h. Our result depends entirely
on the local factorization of the characteristic polynomial $\chi$ of h. To each
factor $\chi_i$ of $\chi$, there corresponds a distribution $\theta_i$ and a projection operator
$\varepsilon_i(h)$, which is a polynomial in h, and the local integrability condition of $\theta_i$
is $(I - \varepsilon_i(h))[\varepsilon_i(h), \varepsilon_i(h)] = 0$ (Theorem 4.2). To each product $\chi_{i_1} \cdots \chi_{i_s}$ of
distinct factors of $\chi$, there corresponds a distribution $\theta_{i_1 \dots i_s}$. The necessary
and sufficient condition for these distributions to be all locally integrable is
$[\varepsilon_i(h), \varepsilon_i(h)] = 0$ for all i (Corollary 4.3).


2. Vector forms and projection operators. Let M be a $C^\infty$-manifold
and $\Phi$ the ring of $C^\infty$-functions on M. By a neighbourhood of a point p in
M, we mean an open, connected subset of M containing p.





Received September 23, 1959. This research was supported by an Office of Naval Research 
contract at the University of Washington. 

*h is said to be of type A if (i) there are functions $a_1, \dots, a_r$ on M, such that $(a_i)_p$
are distinct at each p, and give the eigenvalues of $h_p$, and if (ii) there are vector fields
$v_{i1}, \dots, v_{im_i}$ on M, $i = 1, \dots, r$, $m_1 + \dots + m_r = n$, such that $(v_{i1})_p, \dots, (v_{im_i})_p$ are
eigenvectors corresponding to $(a_i)_p$ and are linearly independent.


















Definition 2.1. A vector q-form is a $C^\infty$-tensor field over M, skew-symmetric
in the covariant part, of covariant degree q, and of contravariant degree 1.
Let h be a vector 1-form. Then we see that h is nothing but a rule which
assigns to each point p of M a linear transformation $h_p$ of the tangent space
$T_p$ at p to M. Following Nijenhuis (4, 2) we introduce a vector 2-form [h, h]
defined by
(2.1) $\tfrac{1}{2}[h, h](u, v) = [hu, hv] + hh[u, v] - h[hu, v] - h[u, hv],$
where u, v are vector fields over M. That (2.1) does define a tensor follows
from the $\Phi$-linearity in u and v of the right side of (2.1).*


Definition 2.2. A vector 1-form e satisfying $e^2 = e$ on a neighbourhood U
is called a projection operator on U.

Remark 1. $\dim e_qT_q$ is constant for $q \in U$, and we call this constant the
rank of e. In fact, $\dim e_qT_q$, which is an integer, is equal to the trace of $e_q$,
which depends continuously on q, hence is a constant.

Remark 2. If e is a projection operator on U, so is $e' = I - e$, where I is
the identity vector 1-form. We have $e + e' = I$, $ee' = e'e = 0$ and
$T_q = e_qT_q \oplus e'_qT_q$ for $q \in U$.

Furthermore we have

(2.2) $[e, e] = [e', e'].$

Definition 2.3. A law $\theta$ which assigns to each point p in a neighbourhood U
of M, an r-dimensional vector subspace $\theta_p$ of the tangent space $T_p$ of M at p,
is called an r-dimensional distribution over U. If at each $p \in U$, we can find
a neighbourhood U’ of p, U’ contained in U, and r $C^\infty$-vector fields $X_1, \dots, X_r$
over U’, such that $(X_1)_q, \dots, (X_r)_q$ form a basis for $\theta_q$ at each $q \in U'$, we
say that $\theta$ is $C^\infty$.

Definition 2.4. Let $\theta$ be an r-dimensional $C^\infty$-distribution over a neighbourhood
U of p. If there is a neighbourhood U’ of p and, for each $q \in U'$, an
r-dimensional submanifold N contained in U’ and passing through q, such
that $\theta_{q'}$ is the tangent space of N at each $q' \in N$, then we say that $\theta$ is integrable
in U’, a neighbourhood of p.

Definition 2.5. Let $\theta$ be a $C^\infty$-distribution over a neighbourhood U of p. If
there is a neighbourhood U’ of p contained in U such that, for any two
$C^\infty$-vector fields $X_1, X_2$ over U’, satisfying $(X_1)_q, (X_2)_q \in \theta_q$ ($q \in U'$), we have
$[X_1, X_2]_q \in \theta_q$, then we say that $\theta$ is involutive in U’.

A $C^\infty$-distribution $\theta$ over a neighbourhood U of p is integrable in a neighbourhood
U’ of p, contained in U, if and only if $\theta$ is involutive in the neighbourhood
U’ of p, Frobenius (1).

Now, if e is a projection operator of rank r over U, then $\theta$, defined by
$q \to e_qT_q$, where $T_q$ is the tangent space of M at $q \in U$, is an r-dimensional


*For details of this type of argument, see the proof of Proposition (3.4) in (2). 







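$C^\infty$-distribution over U. To see that $\theta$ is $C^\infty$, choose a co-ordinate system
$x_1, \dots, x_n$ in a neighbourhood of q. Then we can pick r $C^\infty$-vector fields from
$e(\partial/\partial x_1), \dots, e(\partial/\partial x_n)$, say, $e(\partial/\partial x_1), \dots, e(\partial/\partial x_r)$, so that

$(e\,\partial/\partial x_1)_{q'}, \dots, (e\,\partial/\partial x_r)_{q'}$

are linearly independent, hence form a basis for $e_{q'}T_{q'}$, for q’ in a neighbourhood
of q.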


LEMMA 2.1. Let e be a projection operator over a neighbourhood U of p, and
let $\theta$ be the $C^\infty$-distribution defined by $q \to e_qT_q$, $q \in U$. Then $\theta$ is integrable in
a neighbourhood of p, if and only if $(I - e)[e, e] = 0$ on a neighbourhood of p.

Proof. If u, v are two $C^\infty$-vector fields over a neighbourhood of p, then
we have
$\tfrac{1}{2}(I - e)[e, e](u, v)$
$= (I - e)[eu, ev] - (I - e)e[eu, v] - (I - e)e[u, ev] + (I - e)e^2[u, v]$
$= (I - e)[eu, ev],$
since $(I - e)e = 0$.

If u is a $C^\infty$-vector field over a neighbourhood U’ of p, then eu is a $C^\infty$-vector
field over U’ such that $e_qu_q \in e_qT_q$, $q \in U'$. Conversely, if u is a $C^\infty$-vector
field over U’ such that $u_q \in e_qT_q$, $q \in U'$, then $u_q = e_qu_q$, hence u = eu. Hence,
using Frobenius’ theorem, we see that $\theta$ is integrable in a neighbourhood of p
if and only if $[eu, ev]_q \in e_qT_q$ for all q in a neighbourhood U’’ of p, and all
$C^\infty$-vector fields u, v over U’’. This condition is equivalent to $(I - e)[eu, ev]_q = 0$,
and the computation above shows that the latter in turn is equivalent to
$(I - e)[e, e](u, v)_q = 0$. Q.E.D.


If $e_i$, $i = 1, \dots, g$ are projection operators on U, $p \in U$, satisfying

$\sum_{i=1}^{g} e_i = I, \qquad e_ie_j = e_je_i,$

then it can be shown that $e_ie_j = 0$ for $i \ne j$, and that $T_q = (e_1)_qT_q \oplus \dots \oplus (e_g)_qT_q$
for $q \in U$. Let $\theta_{i_1 \dots i_s}$ be the $C^\infty$-distribution over U defined by

$q \to (e_{i_1})_qT_q \oplus \dots \oplus (e_{i_s})_qT_q.$

Here $i_1, \dots, i_s$ should be all distinct.

If $\theta_{1 \dots g-1}$ and $\theta_{2 \dots g}$ are both integrable in a neighbourhood of p, then
using Frobenius’ theorem, we see that $\theta_{2 \dots g-1}$ is integrable in a neighbourhood
of p. Repeating this argument, we have: the distributions

$\theta_{i_1 \dots i_s}, \qquad 1 \le i_1 < \dots < i_s \le g, \quad s = 1, \dots, g$













are all integrable in a neighbourhood of p if and only if the distributions
$\theta_{1 \dots \hat{\imath} \dots g}$ (the index i omitted), $i = 1, \dots, g$, are all integrable in a neighbourhood of p.

LEMMA 2.2. The distributions
$\theta_{i_1 \dots i_s}, \qquad 1 \le i_1 < \dots < i_s \le g, \quad s = 1, \dots, g$
are all integrable in a neighbourhood of p if and only if $[e_i, e_i] = 0$ for all i in
a neighbourhood of p.

Proof. Notice first that $e_1 + \dots + \hat{e}_i + \dots + e_g = I - e_i$, where the hat marks
the omitted term. Using Lemma 2.1 and (2.2), the integrability of the distribution
$\theta_{1 \dots \hat{\imath} \dots g}$ can be expressed as
$e_i[e_i, e_i] = 0.$

Now if all the distributions $\theta_{i_1 \dots i_s}$ are integrable in a neighbourhood
of p, then in particular $\theta_{1 \dots \hat{\imath} \dots g}$ and $\theta_i$ are integrable in a neighbourhood
of p, so we have $e_i[e_i, e_i] = 0$ and $(I - e_i)[e_i, e_i] = 0$, and hence $[e_i, e_i] = 0$
on a neighbourhood of p.

Conversely if $[e_i, e_i] = 0$ on a neighbourhood of p then of course $e_i[e_i, e_i] = 0$
on a neighbourhood of p, thus the integrability of $\theta_{1 \dots \hat{\imath} \dots g}$.

3. The characteristic polynomial of a vector 1-form. Let h be a
vector 1-form on M and let $\lambda$ be an indeterminate. Suppose $\{x_1, \dots, x_n\}$ is
a co-ordinate system in a neighbourhood of p in M. h has components $h_i^j(x)$
in this neighbourhood, where

(3.1) $h\,\dfrac{\partial}{\partial x_i} = \sum_j h_i^j(x)\,\dfrac{\partial}{\partial x_j}.$

We can consider $\chi = \det\|\lambda\delta_i^j - h_i^j(x)\|$, which is a polynomial in $\lambda$ of degree
n with coefficients which are $C^\infty$-functions of $(x_1, \dots, x_n)$. It is easy to verify
that the coefficients do not depend on the choice of the co-ordinate system, so
we have an element $\chi$ in $\Phi[\lambda]$. $\chi$ is called the characteristic polynomial of h.

PROPOSITION 3.1. Suppose $\chi_p$, the characteristic polynomial of $h_p$, has a
factorization over the reals R:
(3.2) $\chi_p = K_1^{m_1} \cdots K_g^{m_g}$
where $K_i \in R[\lambda]$, with leading coefficients 1, and the $K_i$ are all distinct and irreducible
over R. Then there is a neighbourhood U of p, where $\chi$ has a unique factorization
(3.3) $\chi = \chi_1 \cdots \chi_g$ on U
satisfying

(i) $\chi_i \in \Phi_U[\lambda]$, where $\Phi_U$ is the ring of $C^\infty$-functions on U;
(ii) $\chi_i$ has leading coefficient 1, $\deg \chi_i = \deg K_i^{m_i}$;
(iii) $(\chi_i)_p = K_i^{m_i}$;
(iv) $(\chi_i)_q$ and $(\chi_j)_q$ are relatively prime for $q \in U$, $i \ne j$.







For the proof, we apply the following lemma repeatedly. 


LEMMA 3.2. Let $\varphi \in \Phi[\lambda]$ with leading coefficient 1. Suppose for a point p in
M, $\varphi_p = PQ$, where $P, Q \in R[\lambda]$ (R = the real numbers), with leading coefficients
1, and relatively prime to each other. Then there is a neighbourhood U
of p, and unique $\mu, \pi \in \Phi_U[\lambda]$ ($\Phi_U$ = the ring of $C^\infty$-functions over U), with
leading coefficients 1, such that $\varphi = \mu\pi$ holds over U, and $\mu_p = P$, $\pi_p = Q$. Moreover
U can be so chosen that $\mu_q$ and $\pi_q$ are relatively prime at each $q \in U$.


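Proof. Let the degree of P and Q be k and l respectively. Let $x_i$ ($i = 1, \dots, k$),
$y_j$ ($j = 1, \dots, l$), $z_s$ ($s = 1, \dots, k + l$) be variables, and $x_0 = y_0 = z_0 = 1$.
For $\{x_1, \dots, x_k\}$, $\{y_1, \dots, y_l\}$, and $\{z_1, \dots, z_{k+l}\}$, we write x, y, and z
respectively. Let P(x), Q(y), and F(z) be polynomials in $\lambda$, defined by

(3.4) $P(x) = \sum_{i=0}^{k} x_i\lambda^{k-i}, \qquad Q(y) = \sum_{j=0}^{l} y_j\lambda^{l-j},$

and

$F(z) = \sum_{s=0}^{k+l} z_s\lambda^{k+l-s}.$

Let us take k + l functions of x, y, z defined by

(3.5) $G_s(x, y, z) = -z_s + \sum_{i+j=s} x_iy_j \qquad (s = 1, \dots, k + l).$

Finally, let $a_0 (= 1), a_1, \dots, a_k$; $b_0 (= 1), b_1, \dots, b_l$; $c_0 (= 1), c_1, \dots, c_{k+l}$ be
real numbers such that $P(a) = P$, $Q(b) = Q$, $\varphi_p = F(c)$. $PQ = \varphi_p$ is now
$G_s(a, b, c) = 0, \quad s = 1, \dots, k + l.$
The Jacobian

$J(x, y; z) = \dfrac{\partial(G_1, \dots, G_{k+l})}{\partial(x, y)}$

has the form (3.6), which is nothing but the resultant of the two polynomials
in $\lambda$, P(x) and Q(y).

(3.6) $J(x, y; z) = \det\bigl(\,y_{s-i}\;\big|\;x_{s-j}\,\bigr), \qquad s = 1, \dots, k + l;\ i = 1, \dots, k;\ j = 1, \dots, l,$

where the s-th row consists of the entries $\partial G_s/\partial x_i = y_{s-i}$ followed by the entries
$\partial G_s/\partial y_j = x_{s-j}$, with $x_0 = y_0 = 1$, $x_m = 0$ for $m < 0$ or $m > k$, and $y_m = 0$ for
$m < 0$ or $m > l$.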













As P(a) = P and Q(b) = Q are relatively prime, we have $J(a, b; z) \ne 0$.
In particular $J(a, b; c) \ne 0$.
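
As an illustrative aside (not part of the original argument), the criterion used here, that two monic polynomials are relatively prime exactly when their resultant is non-zero, can be checked on small examples. The Python sketch below builds the Sylvester matrix in its usual row layout, which agrees with (3.6) up to transposition and a reordering of rows, hence has the same determinant up to sign. The helper names `sylvester`, `det`, and `resultant` are hypothetical, and coefficients are listed from the leading term down.

```python
from fractions import Fraction


def sylvester(p, q):
    """Sylvester matrix of p and q (coefficients from the leading term down)."""
    k, l = len(p) - 1, len(q) - 1
    n = k + l
    rows = []
    for i in range(l):  # l shifted copies of the coefficients of p
        rows.append([0] * i + list(p) + [0] * (n - k - 1 - i))
    for i in range(k):  # k shifted copies of the coefficients of q
        rows.append([0] * i + list(q) + [0] * (n - l - 1 - i))
    return rows


def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    return det(sylvester(p, q))


# lambda^2 + 1 and lambda - 2 have no common root: non-zero resultant.
coprime_case = resultant([1, 0, 1], [1, -2])
# (lambda - 1)(lambda - 2) and lambda - 1 share a root: zero resultant.
common_root_case = resultant([1, -3, 2], [1, -1])
```

In the proof, the first situation is the one that holds at the point (a, b), which is what allows the implicit function theorem to be applied.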

Furthermore, as $G_s(a, b, c) = 0$, $s = 1, \dots, k + l$, we can use the implicit
function theorem to find (i) an open neighbourhood V of c in $R^{k+l}$, the
(k + l)-dimensional euclidean space, and (ii) a unique set of $C^\infty$-functions
$f_i, g_j$, $i = 1, \dots, k$, $j = 1, \dots, l$, defined on V and satisfying (A) and (B):

(A) $G_s(f_1(z), \dots, f_k(z), g_1(z), \dots, g_l(z), z) = 0$ for $z \in V$;
(B) $f_i(c) = a_i$, $g_j(c) = b_j$, $i = 1, \dots, k$; $j = 1, \dots, l$.


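Now, let

$\varphi = \sum_{s=0}^{k+l} \varphi_s\lambda^{k+l-s},$

where $\varphi_s \in \Phi$, $\varphi_0 = 1$. By $\psi$ we denote the $C^\infty$-mapping $M \to R^{k+l}$ defined
by $q \to (\varphi_1(q), \dots, \varphi_{k+l}(q))$. Take U to be the connected component of
$\psi^{-1}(V)$ containing p. If we let $\alpha_i = f_i \circ \psi$ and $\beta_j = g_j \circ \psi$, then our desired
elements of $\Phi_U[\lambda]$ are

$\mu = \sum_{i=0}^{k} \alpha_i\lambda^{k-i}$

and

$\pi = \sum_{j=0}^{l} \beta_j\lambda^{l-j},$

where $\alpha_0 = \beta_0 = 1$.
As $J(a, b; c) \ne 0$ and as $J(q) = J(\alpha_1(q), \dots, \alpha_k(q), \beta_1(q), \dots, \beta_l(q);
\varphi_1(q), \dots, \varphi_{k+l}(q))$ is a continuous function of q in U, we can take a neighbourhood
U’ of p, contained in U, such that $J(q) \ne 0$ for $q \in U'$. Then for
$q \in U'$, $\mu_q = P(\alpha_1(q), \dots, \alpha_k(q))$ and $\pi_q = Q(\beta_1(q), \dots, \beta_l(q))$ are relatively
prime. Q.E.D.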


Remark 1. If we take (3.2) to be the factorization of $\chi_p$ into irreducible factors
over the complex numbers C, then all $K_i$ are linear in $\lambda$, and we obtain factors
$\chi_i$ whose coefficients lie in the ring of complex-valued $C^\infty$-functions over U.
However, this result does not appear to be necessary for our purpose.

Remark 2. If $m_i > 1$, one might expect to obtain a further factorization
$\chi_i = \chi_{i1}\chi_{i2}, \qquad \chi_{i1}, \chi_{i2} \in \Phi_{U'}[\lambda]$
for a neighbourhood U’ of p, contained in U. But the following example
shows that this is not necessarily the case.

Let $\varphi$ be a polynomial in $\lambda$, with coefficients depending on two real parameters
x and y, and having the form

(3.7) $\varphi = \lambda^4 - 2x\lambda^2 + (x^2 + y^2).$







Then $\varphi_{(0,0)} = 0$ has $\lambda = 0$ as a root of multiplicity 4. The equation $\varphi_{(x,y)} = 0$
has four roots $\pm r^{1/2}(\cos \tfrac{1}{2}\theta \pm i \sin \tfrac{1}{2}\theta)$, where $x = r\cos\theta$, $y = r\sin\theta$, and we
pick fixed branches for $\cos \tfrac{1}{2}\theta$ and $\sin \tfrac{1}{2}\theta$. So (3.7) has a unique factorization
over R at any point $(x_0, y_0) \ne (0, 0)$:

(3.8) $\varphi = (\lambda^2 - 2r_0^{1/2}\cos \tfrac{1}{2}\theta_0\,\lambda + r_0)(\lambda^2 + 2r_0^{1/2}\cos \tfrac{1}{2}\theta_0\,\lambda + r_0).$

If we want to extend this factorization over a small neighbourhood of $(x_0, y_0)$
we have

(3.9) $\varphi = (\lambda^2 - 2r^{1/2}\cos \tfrac{1}{2}\theta\,\lambda + r)(\lambda^2 + 2r^{1/2}\cos \tfrac{1}{2}\theta\,\lambda + r).$

This extension is uniquely determined by requiring the coefficients in the
factors of (3.9) to be continuous in a small neighbourhood of $(x_0, y_0)$. However,
(3.9) will not give a factorization in a neighbourhood of (0, 0) because in a
neighbourhood of (0, 0), $\cos \tfrac{1}{2}\theta$ is not a single-valued function.
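
A small numerical check of this example may be helpful; it is only a sketch, with hypothetical function names, written for the polynomial (3.7) and the factors in (3.9). It compares the product of the two quadratic factors with the quartic at a few sample points, and then evaluates the coefficient of $\lambda$ in the first factor at $\theta = 0$ and $\theta = 2\pi$, which describe the same point of the circle r = 1.

```python
import math


def phi(lam, x, y):
    """The quartic (3.7): lam^4 - 2*x*lam^2 + (x^2 + y^2)."""
    return lam ** 4 - 2 * x * lam ** 2 + (x * x + y * y)


def factored(lam, r, theta):
    """Product of the two quadratic factors in (3.9) at polar point (r, theta)."""
    b = 2 * math.sqrt(r) * math.cos(theta / 2)
    return (lam * lam - b * lam + r) * (lam * lam + b * lam + r)


def linear_coefficient(theta):
    """Coefficient of lam in the first factor of (3.9) on the circle r = 1."""
    return -2 * math.cos(theta / 2)


# Largest discrepancy between (3.7) and the product (3.9) over sample points.
max_error = max(
    abs(phi(lam, 1.7 * math.cos(t), 1.7 * math.sin(t)) - factored(lam, 1.7, t))
    for t in (0.3, 1.0, 2.5, 4.0, 6.0)
    for lam in (-1.3, 0.4, 2.0)
)

# theta = 0 and theta = 2*pi give the same point (1, 0), yet the coefficient
# changes sign after one circuit of the origin.
start = linear_coefficient(0.0)
after_one_turn = linear_coefficient(2 * math.pi)
```

The product is insensitive to the sign of $\cos \tfrac{1}{2}\theta$ (the two factors are merely interchanged), but each factor taken separately is not, which is the obstruction described in the remark.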


Remark 3. If in (3.3) we have $\chi_i = (\lambda - \alpha_i)^{m_i}$ for some i, then $(\chi_i)_q = 0$
has only one root of multiplicity $m_i$, for $q \in U$. If $\chi_i = (\lambda^2 + \beta_i\lambda + \beta_i')^{m_i}$
for some i, then $(\chi_i)_q = 0$ has two distinct complex roots, each of multiplicity
$m_i$, for $q \in U'$, where the neighbourhood U’ is chosen sufficiently small with
$p \in U' \subset U$. In both cases it is easy to see that $\alpha_i \in \Phi_U$ or $\beta_i, \beta_i' \in \Phi_U$
(for example, by expanding $(\lambda - \alpha_i)^{m_i}$ or $(\lambda^2 + \beta_i\lambda + \beta_i')^{m_i}$ and using the
fact that the coefficients in the expansion are $C^\infty$-functions on U).

Remark 4. Although it may not be possible to factor $\chi_i$ any further into
polynomials in $\lambda$ with $C^\infty$-coefficients over some neighbourhood, it is well
known that the roots of $(\chi_i)_q = 0$, for each i, are continuous (multivalued)
functions of q. In particular, the roots of $(\chi_i)_q = 0$ are close to those of
$K_i = 0$ if q is close to p.


4. Integration. Let A be a linear transformation on a finite dimensional
vector space V over the reals R. Let $\lambda$ be an indeterminate, and consider V
as an $R[\lambda]$-module by letting $Fv = F(A)v$ for $F \in R[\lambda]$, $v \in V$, where, if

$F = \sum_{i=0}^{m} a_i\lambda^i,$

F(A) denotes the linear transformation

$\sum_{i=0}^{m} a_iA^i.$

Let K be the characteristic polynomial of A, and suppose K = FG where
$F, G \in R[\lambda]$; deg F, deg G < deg K; F and G have leading coefficients 1 and
are relatively prime over R. Then there exist unique $P, Q \in R[\lambda]$, with
deg P < deg G, deg Q < deg F, satisfying
(4.1) $PF + QG = 1.$
Because KV = 0, we have from (4.1), $(PF)^2v = (PF)v$ for all $v \in V$. Let
$V_F = (QG)V$ and $V_G = (PF)V$, then we have
(4.2) $V = V_F \oplus V_G.$
















It is also easy to see that $V_F = \{v \in V \mid Fv = 0\}$ and $V_G = \{v \in V \mid Gv = 0\}$.
In fact let $V_F' = \{v \in V \mid Fv = 0\}$. Then, as $FV_F = 0$, we have $V_F \subset V_F'$.
Conversely, if $v \in V_F'$, then $0 = (PF)v = (1 - QG)v$, hence $v = (QG)v \in V_F$,
so $V_F \supset V_F'$. Finally, F is the characteristic polynomial of $A|V_F$, and
$\dim V_F = \deg F$; G is the characteristic polynomial of $A|V_G$, and $\dim V_G =
\deg G$.
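
To make the construction concrete, here is a minimal Python sketch under stated assumptions: polynomials are coefficient lists with the constant term first, exact arithmetic uses `fractions.Fraction`, and all helper names (`bezout`, `poly_at`, and so on) are hypothetical. It finds P and Q with PF + QG = 1 by the extended Euclidean algorithm and evaluates the two projections (QG)(A) and (PF)(A) for a 3 × 3 matrix whose characteristic polynomial is $(\lambda - 2)^2(\lambda - 3)$.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (constant term is stored first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def pmul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)


def pdivmod(p, q):
    """Quotient and remainder of p by q (q must not be the zero polynomial)."""
    p, q = trim(p), trim(q)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 1)
    rem = [Fraction(c) for c in p]
    while len(rem) >= len(q) and rem != [0]:
        shift = len(rem) - len(q)
        c = rem[-1] / q[-1]
        quo[shift] = c
        for i, b in enumerate(q):
            rem[i + shift] -= c * b
        rem = trim(rem[:-1]) if len(rem) > 1 else [Fraction(0)]
    return trim(quo), rem


def bezout(f, g):
    """Return (p, q) with p*f + q*g = 1 for relatively prime f and g."""
    r0 = trim([Fraction(c) for c in f])
    r1 = trim([Fraction(c) for c in g])
    s0, s1 = [Fraction(1)], [Fraction(0)]
    t0, t1 = [Fraction(0)], [Fraction(1)]
    while r1 != [0]:
        quo, rem = pdivmod(r0, r1)
        neg = [-c for c in quo]
        r0, r1 = r1, rem
        s0, s1 = s1, padd(s0, pmul(neg, s1))
        t0, t1 = t1, padd(t0, pmul(neg, t1))
    if len(r0) != 1:
        raise ValueError("polynomials are not relatively prime")
    return [a / r0[0] for a in s0], [a / r0[0] for a in t0]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at(p, a):
    """Evaluate the polynomial p at the square matrix a by Horner's rule."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(p):
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c
    return result


A = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]  # characteristic polynomial (l-2)^2 (l-3)
F = [4, -4, 1]                         # F = (lambda - 2)^2
G = [-3, 1]                            # G = lambda - 3
P, Q = bezout(F, G)
e_F = poly_at(pmul(Q, G), A)           # projection onto V_F = (QG)V
e_G = poly_at(pmul(P, F), A)           # projection onto V_G = (PF)V
```

For this matrix the two projections come out as diag(1, 1, 0) and diag(0, 0, 1), so that $\dim V_F = 2 = \deg F$ and $\dim V_G = 1 = \deg G$, in agreement with the statement above.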

Now, if we take A to be $h_p$ and V to be $T_p$ in the argument above, (4.2)
gives a decomposition of $T_p$. We want to extend this decomposition to each
$T_q$, for q in a neighbourhood of p, with the help of the factorization (3.3) of
the characteristic polynomial $\chi$ of h. For this purpose we first prove a lemma.


LEMMA 4.1. If $\varphi$ and $\psi$ are elements of $\Phi[\lambda]$ with leading coefficients 1 and
degree k and l respectively, and if at each point q in a neighbourhood U, $\varphi_q$ and
$\psi_q$ are relatively prime, then there exist unique $\mu, \pi \in \Phi_U[\lambda]$ of degree $\le l - 1$,
$k - 1$ respectively, satisfying
(4.3) $\mu\varphi + \pi\psi = 1$ over U.


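Proof. Let

$\varphi = \sum_{i=0}^{k} \alpha_i\lambda^{k-i}, \qquad \psi = \sum_{i=0}^{l} \beta_i\lambda^{l-i},$

where $\alpha_i, \beta_i \in \Phi$, and $\alpha_0 = \beta_0 = 1$. Let

$\mu = \sum_{i=1}^{l} \mu_i\lambda^{l-i}, \qquad \pi = \sum_{i=1}^{k} \pi_i\lambda^{k-i}.$

Substituting these expressions in (4.3), we see that finding the required $\mu, \pi$
is equivalent to solving (4.4) for the $\mu_i$'s and $\pi_i$'s:

(4.4) $\sum_{i+j=p} \alpha_i\mu_j + \sum_{i+j=p} \beta_i\pi_j = 0 \quad (1 \le p \le k + l - 1), \qquad \alpha_k\mu_l + \beta_l\pi_k = 1.$

The determinant D of the coefficients of the left member of (4.4) is

(4.5) $D = \det\bigl(\,\alpha_{p-j}\;\big|\;\beta_{p-i}\,\bigr), \qquad p = 1, \dots, k + l;\ j = 1, \dots, l;\ i = 1, \dots, k,$

where the p-th row consists of the coefficients $\alpha_{p-j}$ of $\mu_j$ followed by the coefficients
$\beta_{p-i}$ of $\pi_i$, with $\alpha_m = 0$ for $m < 0$ or $m > k$ and $\beta_m = 0$ for $m < 0$ or $m > l$.

$D_q$, $q \in U$, is the resultant of two polynomials $\varphi_q, \psi_q$ in $R[\lambda]$, and as $\varphi_q$ and
$\psi_q$ are relatively prime, we have $D_q \ne 0$. Hence we can solve (4.4) for $\mu_i$ and
$\pi_i$ over U (the solution is unique) and find them as rational functions of $\alpha_i$
and $\beta_i$, with non-zero denominator over U. Hence $\mu_i, \pi_i \in \Phi_U$. Thus
$\mu, \pi \in \Phi_U[\lambda]$ are uniquely determined. Q.E.D.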


Now, if $\chi$ is the characteristic polynomial of h and if

$\chi = \chi_1\chi_2 \cdots \chi_g$ over a neighbourhood U of p

is the factorization (3.3), then $\chi_i$ and $\hat\chi_i = \chi_1 \cdots \chi_{i-1}\chi_{i+1} \cdots \chi_g$ are relatively prime
at each point of U. By Lemma 4.1 we have $\mu_i, \pi_i \in \Phi_U[\lambda]$ satisfying
(4.6) $\mu_i\chi_i + \pi_i\hat\chi_i = 1$ on U.

As before, using $\chi_qT_q = 0$ for $q \in U$, and (4.6), we see that $[(\pi_i\hat\chi_i)(h)]^2 =
(\pi_i\hat\chi_i)(h)$. Let us denote $\pi_i\hat\chi_i \in \Phi_U[\lambda]$ by $\varepsilon_i$, and $(\pi_i\hat\chi_i)_qT_q$ by $T_q(\chi_i)$. Then
$\varepsilon_i(h)$ is a projection operator on U, and $\dim T_q(\chi_i) = \deg \chi_i$. As $T_q = T_q(\chi_1) \oplus \dots \oplus T_q(\chi_g)$,
we have

$\sum_{i=1}^{g} \varepsilon_i(h) = I.$

Furthermore $\theta_i$ defined by $q \to T_q(\chi_i)$, $q \in U$, is a $C^\infty$-distribution over U.
Using Lemma 2.1 we have:

THEOREM 4.2. The distribution $\theta_i$ is integrable in a neighbourhood of p if and
only if
(4.7) $(I - \varepsilon_i(h))[\varepsilon_i(h), \varepsilon_i(h)] = 0$
holds on a neighbourhood of p.

As in § 2, if we define
$\theta_{i_1 \dots i_s}$ by $q \to T_q(\chi_{i_1}) \oplus \dots \oplus T_q(\chi_{i_s})$,
we have, by Lemma 2.2:

COROLLARY 4.3. The distributions $\theta_{i_1 \dots i_s}$, $1 \le i_1 < \dots < i_s \le g$,
$s = 1, \dots, g$, are all integrable in a neighbourhood of p, if and only if
$[\varepsilon_i(h), \varepsilon_i(h)] = 0$ holds on a neighbourhood of p for all i.


The important feature of the projection operator $\varepsilon_i(h)$ is that $\varepsilon_i(h)$ is a
polynomial in h with coefficients in $\Phi_U$. This property essentially characterizes
$\varepsilon_i(h)$, as shown below.

PROPOSITION 4.4. Let the characteristic polynomial $\chi$ of h have the factorization
(3.2) $\chi_p = K_1^{m_1} \cdots K_g^{m_g}$ at p, and (3.3) $\chi = \chi_1 \cdots \chi_g$ on a neighbourhood U
of p; and let $\varepsilon_i(h)$ be the projection operator on U corresponding to $\chi_i$. If e is a
projection operator on U such that $e = \varepsilon(h)$, $\varepsilon \in \Phi_U[\lambda]$, then on U we have

(4.8) $e = \varepsilon(h) = \sum_{i=1}^{g} \delta_i\varepsilon_i(h)$ where $\delta_i = 0$ or 1.


First we prove a lemma. 













LEMMA 4.5. Let A be a linear transformation on a real vector space V of finite
dimension, and suppose that the characteristic polynomial of A is of the form
$K^m$, where $K \in R[\lambda]$ is irreducible over R. Then if for $P \in R[\lambda]$, $P(A)^2 = P(A)$,
then P(A) = I or 0.

Proof. Let Q = 1 − P, then $Q(A)^2 = Q(A)$. If $P(A) \ne I$ and $P(A) \ne 0$,
then V has a decomposition $V = V_1 \oplus V_2$; $V_1, V_2 \ne \{0\}$, where $V_1 = PV$,
$V_2 = QV$. As

$P(A|V_2) = P(AQ(A)) = P(A)Q(A) = 0,$

P is divisible by the minimal polynomial of $A|V_2$, which in turn is equal to
$K^{m'}$ for some m’, $1 \le m' \le m$. Hence P is divisible by K. Similarly Q is
divisible by K. But P and Q = 1 − P are relatively prime, so we have a
contradiction. Q.E.D.


Proof of Proposition 4.4. As e is a polynomial in h with coefficients
in $\Phi_U$, $e_qT_q(\chi_i) \subset T_q(\chi_i)$. We can define projection operators $e_i$ over U by
letting $e_i = e\,\varepsilon_i(h)$. Then $e_ie_j = 0$ for $i \ne j$ and $e = \sum_{i=1}^{g} e_i$ over U. We want
to prove either $e_i = 0$ or $e_i = \varepsilon_i(h)$ for each i.

We first notice that $e_i|T_p(\chi_i) = \varepsilon(h|T_p(\chi_i))$, and that $h|T_p(\chi_i)$ has
characteristic polynomial $K_i^{m_i}$. Hence, using Lemma 4.5, we see that either
$e_i(T_p(\chi_i)) = \{0_p\}$ or $e_i(T_p(\chi_i)) = T_p(\chi_i)$. But, as $e_i(T_q(\chi_j)) = \{0_q\}$ for
$j \ne i$, $q \in U$, and, as $e_i$ has constant rank over U, we conclude that (i) if
$e_i(T_p(\chi_i)) = \{0_p\}$ then $e_i(T_q(\chi_i)) = \{0_q\}$ for all $q \in U$, and that (ii) if
$e_i(T_p(\chi_i)) = T_p(\chi_i)$ then $e_i(T_q(\chi_i)) = T_q(\chi_i)$ for all $q \in U$. In the first case
$e_i = 0$; in the second case $e_i = \varepsilon_i(h)$. Q.E.D.





REFERENCES 


1. C. Chevalley, Theory of Lie groups I (Princeton, 1946).

2. A. Frölicher and A. Nijenhuis, Theory of vector-valued differential forms I, Proc. Kon. Ned.
Ak. Wet. Amsterdam A 59 (3) (1956), 338-359.

3. J. Haantjes, On $X_m$-forming sets of eigenvectors, Proc. Kon. Ned. Ak. Wet. Amsterdam A
58 (2) (1955), 158-162.

4. A. Nijenhuis, $X_{n-1}$-forming sets of eigenvectors, Proc. Kon. Ned. Ak. Wet. Amsterdam A 54
(2) (1951), 200-212.

5. J. A. Schouten, Sur les tenseurs de $V_n$ aux directions principales $V_{n-1}$-normales, Coll. de
Géom. Diff. Louvain, avril (1951), 11-14.

6. A. Tonolo, Sulle varietà riemanniane a tre dimensioni, Pont. Accad. Sci. Acta, 13 (1949),
29-53; Atti Accad. Naz. Lincei Rendi. Cl. Sci. Fis. Mat. Nat. (8), 6 (1949), 438-444.


University of Washington 











THE LEBESGUE CONSTANTS 
FOR REGULAR HAUSDORFF METHODS 


LEE LORCH AND DONALD J. NEWMAN


1. Introduction. The unboundedness of the sequence of Lebesgue con- 
stants (norms), at a point, of certain transforms implies, as is well known, 
that there exist (i) a continuous function whose transform fails to converge 
to the function at the point in question (the du Bois-Reymond singularity), 
and (ii) another such function whose transform, while converging everywhere 
to the function, does not do so uniformly in any neighbourhood of the stipu- 
lated point (the Lebesgue singularity). The converses also hold in our case. 

The magnitude of such constants is, consequently, of some interest and 
has been calculated for many transforms. 

Here we are concerned with the Lebesgue constants L(n; g) arising from
the application to Fourier series of the regular* Hausdorff summation method
with weight function g(t), $0 \le t \le 1$. The function g(t) is of bounded variation,†
continuous at the origin, with g(0) = 0 and g(1) = 1. The general
properties of such methods are elaborated in (3, chapter XI); specific applications
to Fourier series are found in (5). Among the important particular
cases of Hausdorff methods are found the Cesàro, Hölder, and Euler means.

Our primary purpose here is to establish the following: 


THEOREM 1. Let L(n; g) denote the nth Lebesgue constant for the regular
Hausdorff method with weight function g(t). Then
(1) $L(n; g) = C(g)\log n + o(\log n)$ as‡ $n \to \infty$,
where

(2) $C(g) = (2/\pi^2)\,|g(1) - g(1-)| + (1/\pi)\,M\bigl\{\bigl|\sum_k [g(\xi_k+) - g(\xi_k-)]\sin \xi_kx\bigr|\bigr\}.$

Here $\xi_k$ is the kth discontinuity (jump) of g(t) and the summation extends over
all such (possibly countably infinite) values; M{f(x)} represents, as usual, the
mean value of the almost-periodic function f(x). Furthermore,

(3) $0 \le C(g) \le (4/\pi^2)\,V(g),$
where V(g) is the total variation of g(t), $0 \le t \le 1$, and
(4) C(g) = 0 if and only if g(t) is continuous.


Received January 22, 1960. Some of this work was done while the first-named author was 
supported partially through a (U.S.) National Science Foundation research grant, NSF-G 3663. 

*A summation method is "regular" if it sums the sequence $s_1, \dots, s_n, \dots$ to the (finite)
value s whenever $s_n \to s$; "totally regular", if, in addition, this is the case when s is $\infty$.

†When the method is totally regular g(t) is non-decreasing, and conversely.

‡Throughout this paper all o- and O- terms are taken as the parameter becomes infinite.















If, in addition, the method is totally regular (so that V(g) = 1), then also

(5) $C(g) = 4/\pi^2$ if and only if the method is ordinary convergence,
and
(6) $C(g) \le 2/\pi^2$ when g(1−) = g(1) and, in this case,
$C(g) = 2/\pi^2$ if and only if the method is of Euler type.

Thus, C(g) is a constant depending only on the weight function g(t).
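
The extremal values in (5) and (6) can be checked numerically from (2). The sketch below is only an illustration: the function names are hypothetical, the mean value M is approximated by a midpoint rule over a long finite interval, and it is assumed that a jump located at t = 1 also enters the sum in (2) (which is what makes (2) agree with (5)).

```python
import math


def mean_abs_sin(xi, T=2000.0, steps=200000):
    """Midpoint-rule estimate of (1/T) * integral_0^T |sin(xi * x)| dx."""
    h = T / steps
    total = sum(abs(math.sin(xi * (i + 0.5) * h)) for i in range(steps))
    return total * h / T


def c_single_jump(xi):
    """Value of (2) for a weight function with one unit jump at 0 < xi <= 1."""
    jump_at_one = 1.0 if xi == 1.0 else 0.0  # the term |g(1) - g(1-)|
    return (2 / math.pi ** 2) * jump_at_one + mean_abs_sin(xi) / math.pi


euler_type = c_single_jump(0.5)   # Euler-type method: one jump inside (0, 1)
ordinary = c_single_jump(1.0)     # ordinary convergence: one jump at t = 1
```

Since the mean value of $|\sin \xi x|$ is $2/\pi$ for every $\xi > 0$, the two estimates should be close to $2/\pi^2 \approx 0.2026$ and $4/\pi^2 \approx 0.4053$ respectively.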

Equation (4) shows that any Hausdorff method with a discontinuous weight 
function exhibits the du Bois-Reymond singularity (a result obtained originally 
by Hille and Tamarkin (5, Theorem 14.1), whose proof is along different 
lines) and also the Lebesgue singularity. 

Moreover, (5) and (6) show, respectively, that, among all totally regular 
Hausdorff methods, ordinary convergence has the maximum principal term 
for the Lebesgue constants and that the Euler methods possess the same 
extremal property in the class of totally regular Hausdorff methods with 
weight functions continuous at t = 1.

Plainly, (4) does not imply that a Hausdorff method with a continuous, 
or even absolutely continuous, weight function sums the Fourier series of a 
continuous function everywhere, as the remainder in (1) can be unbounded. 
Indeed, Hille and Tamarkin (5, p. 534, Remark 2, also pp. 538 and 568) 
supplied examples showing that absolute continuity of the weight function 
is neither necessary nor sufficient for the effectiveness (for continuous func- 
tions) of the method. 

That absolute continuity is not sufficient we prove anew here by showing
that the error term o(log n) in (1) is "best possible" and cannot be improved
even for the case of an increasing absolutely continuous g(t). More precisely:

THEOREM 2. Let $\varepsilon(n) \downarrow 0$ as $n \to \infty$. There exists an increasing, absolutely
continuous weight function g(t) for which $L(n; g) \ne o(\varepsilon(n)\log n)$.


In addition, we consider also the special cases of Cesàro and Hölder means
of positive fractional order. Here the weight functions are absolutely continuous.
For (C, α) (3, p. 266),

(7) $g_C(t) = 1 - (1 - t)^\alpha, \qquad \alpha > 0,$
and for (H, α),

(8) $g_H(t) = [1/\Gamma(\alpha)]\int_0^t (-\log x)^{\alpha-1}\,dx, \qquad \alpha > 0.$


We shall supply a new proof of the result due to H. Cramér (2) arising 
from (7) and obtain the analogous statement concerning (8), together with 
a relation between the two: 


THEOREM 3 (Cramér). If g(t) is given by (7) and 0 < α < 1, then lim L(n; g)
exists ($n \to \infty$) and equals

(9) $L(C_\alpha) = (2/\pi)\int_0^\infty x^{-2}\left|\,\alpha\int_0^x (1 - tx^{-1})^{\alpha-1}\sin t\,dt\right|dx.$








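Moreover, $L(C_\alpha)$ is a non-increasing function of α with $L(C_\alpha) > 1$; also,
$L(C_\alpha) \to 1$ as $\alpha \to 1-$, and $L(C_\alpha) \to +\infty$ as $\alpha \to 0+$.

THEOREM 4. If g(t) is given by (8) and 0 < α < 1, then lim L(n; g) exists
($n \to \infty$) and equals

(10) $L(H_\alpha) = (2/\pi)\int_0^\infty x^{-2}\left|\,[1/\Gamma(\alpha)]\int_0^x (\log xt^{-1})^{\alpha-1}\sin t\,dt\right|dx.$

Moreover,

(10’) $L(C_\alpha) < L(H_\alpha)$

and, for 0 < α < β < 1,

(10’’) $1 < L(C_\beta) < L(C_\alpha); \qquad 1 < L(H_\beta) < L(H_\alpha),$

with $L(H_\alpha) \to 1$ as $\alpha \to 1-$, and $L(H_\alpha) \to +\infty$ as $\alpha \to 0+$.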


The methods (C, α) and (H, α), α > −1, are well known to be "equivalent,"
as Hausdorff showed (3, p. 264), but not "totally equivalent" (I. Schur,
cf. (3, p. 119), and Basu (1)).*

Basu (1) proved that, for 0 < α < 1, each sequence evaluable (H, α) to
(finite or) infinite s is summable (C, α) to the same value and that the converse
is not true for infinite s. Thus, (C, α) is slightly stronger than (H, α)
for these α.

Inequality (10’) illustrates further this same imbalance, which is found
also in the Gibbs phenomenon: In the (H, α) method the Gibbs phenomenon
persists for larger values of α, as O. Szász (8) found, than in the (C, α) method,
whose Gibbs phenomenon was discussed first by Cramér (2).

Our common point of departure for the proofs of all four theorems is a
formula for L(n; g) due to Livingston (7, p. 310 (3)), who used it to obtain
a more precise version of (1) for methods of Euler type, which are Hausdorff
means with one-jump step functions as their weight functions. His formula
reads:

(11)† $L(n; g) = (2/\pi)\int_0^{\pi/2}\Bigl|\,x^{-1}\int_0^{1-}[1 - 4t(1 - t)\sin^2x]^{n/2}\sin 2nxt\,dg(t) + [g(1) - g(1-)]\,\dfrac{\sin(2n + 1)x}{\sin x}\Bigr|\,dx + o(1).$


2. Preliminary lemmas. We use a simpler version of (11), obtained at
the sacrifice of some precision, having an error term which is o(log n) instead
of o(1) as above. Then we split g(t) into its continuous and pure jump components.
The continuous part will be shown to contribute o(log n), while

*Two methods are “equivalent” if each sums a sequence to the (finite) value s whenever 
the other does; “totally equivalent” if, in addition, the same is true for s infinite. 

†It should be noted that the upper limits of the Stieltjes integrals in (11) and (12) are not
the same, being 1− in the former and 1 in the latter.













the pure jumps give rise to C(g) log n + o(log n). This will complete the proof
of the basic portions of Theorem 1.
The necessary lemmas form the content of this section. 


LEMMA 1.

(12)* $L(n; g) = (2/\pi)\int_1^{n^{1/2}} x^{-1}\left|\int_0^1 \sin xt\,dg(t)\right|dx + (2/\pi^2)\,|g(1) - g(1-)|\log n + o(\log n).$

Proof. Replacing the factor {sin(2n + 1)x}/sin x by {sin 2nx}/x in (11)
induces a bounded error, so that Livingston’s formula, weakened slightly,
can be written as

(13) $L(n; g) = (2/\pi)\int_0^{\pi/2} x^{-1}|K_n(x)|\,dx + O(1),$
where
(14) $K_n(x) = \int_0^1 [1 - 4t(1 - t)\sin^2x]^{n/2}\sin 2nxt\,dg(t).$

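We decompose L(n; g) and consider for fixed ε and A, 0 < ε < 1 < A,

$I_1(n) = \int_0^{\varepsilon/n^{1/2}} x^{-1}|K_n(x)|\,dx,$
$I_2(n) = \int_{\varepsilon/n^{1/2}}^{A/n^{1/2}} x^{-1}|K_n(x)|\,dx,$
and
$I_3(n) = \int_{A/n^{1/2}}^{\pi/2} x^{-1}|K_n(x)|\,dx.$

As to $I_1(n)$: Here $0 \le x \le \varepsilon/n^{1/2}$ and so
$1 \ge [1 - 4t(1 - t)\sin^2x]^{n/2} \ge [\cos^2x]^{n/2} = \cos^nx \ge 1 - \varepsilon^2,$
whence

$\Bigl|\,|K_n(x)| - \Bigl|\int_0^1 \sin 2nxt\,dg(t)\Bigr|\,\Bigr| \le \varepsilon^2\,V(g),$

while, trivially,

$\Bigl|\,|K_n(x)| - \Bigl|\int_0^1 \sin 2nxt\,dg(t)\Bigr|\,\Bigr| \le (4nx)\,V(g).$

Hence,

$I_1(n) = \int_0^{\varepsilon/n^{1/2}} x^{-1}\Bigl|\int_0^1 \sin 2nxt\,dg(t)\Bigr|\,dx + E_0,$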


*It should be noted that the upper limits of the Stieltjes integrals in (11) and (12) are not 
the same, being 1— in the former and 1 in the latter. 










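where

$|E_0| \le V(g)\int_0^{\varepsilon/n} x^{-1}(4nx)\,dx + V(g)\int_{\varepsilon/n}^{\varepsilon/n^{1/2}} x^{-1}\varepsilon^2\,dx = (4\varepsilon + \tfrac{1}{2}\varepsilon^2\log n)\,V(g).$

Finally,

$\int_{\varepsilon/n^{1/2}}^{1/(2n^{1/2})} x^{-1}\Bigl|\int_0^1 \sin 2nxt\,dg(t)\Bigr|\,dx \le [\log(1/\varepsilon)]\,V(g),$

so that, replacing 2nx by x,

(15) $I_1(n) = \int_1^{n^{1/2}} x^{-1}\Bigl|\int_0^1 \sin xt\,dg(t)\Bigr|\,dx + E_1,$

with
$|E_1| \le [4\varepsilon + \tfrac{1}{2}\varepsilon^2\log n + \log(1/\varepsilon) + 1]\,V(g),$

where the 1 has to be added because the portion of the integral going from 0 to 1
has been dropped.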
As to $I_2(n)$: Since $|K_n(x)| \le V(g)$, we have

(16) $0 \le I_2(n) \le [\log(A/\varepsilon)]\,V(g).$
As to $I_3(n)$: Here it is convenient to decompose $K_n(x)$ by writing
(17) $K_n(x) = [g(1) - g(1-)]\sin 2nx + \int_0^{1-}[1 - 4t(1 - t)\sin^2x]^{n/2}\sin 2nxt\,dg(t).$
For $\tfrac{1}{2}\pi \ge x \ge An^{-1/2}$, $(\sin x)/x \ge 2/\pi$, so that

$[1 - 4t(1 - t)\sin^2x]^{n/2} \le [1 - 4t(1 - t)(4A^2)/(\pi^2n)]^{n/2} = [1 - 16A^2\pi^{-2}t(1 - t)n^{-1}]^{n/2}$
$\le \exp\{-(8/\pi^2)(A^2t)(1 - t)\},$

since $(1 - k/n)^n \uparrow e^{-k}$.
Hence, the integral on the right in (17) is dominated in absolute value by

(18) $\varphi(A) = \int_0^{1-} \exp\{-(8/\pi^2)(A^2t)(1 - t)\}\,|dg(t)|.$

This approaches zero as $A \to \infty$, from the dominated convergence theorem,
since g(0+) = g(0) = 0.


Thus,

(19) $I_3(n) = |g(1) - g(1-)|\int_{A/n^{1/2}}^{\pi/2} x^{-1}|\sin 2nx|\,dx + E_3,$

where, for all large n,

$|E_3| \le \varphi(A)\int_{A/n^{1/2}}^{\pi/2} x^{-1}\,dx \le \varphi(A)\log n.$

Furthermore,

$\int_{A/n^{1/2}}^{\pi/2} \dfrac{|\sin 2nx|}{x}\,dx = \int_{A/n^{1/2}}^{\pi/2} \dfrac{|\sin 2nx| - (2/\pi)}{x}\,dx + \dfrac{2}{\pi}\log\dfrac{\pi n^{1/2}}{2A} = (1/\pi)\log n + E_4,$

where, since A > 1,

$|E_4| \le \log A + C,$
with*

$C = \sup_{V > U \ge 1}\Bigl|\int_U^V \dfrac{|\sin t| - (2/\pi)}{t}\,dt\Bigr|.$


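Now,

$\dfrac{L(n; g)}{\log n} = \dfrac{2}{\pi\log n}\,[I_1(n) + I_2(n) + I_3(n)] + o(1),$

and so, from (15), (16), and (19),

$\limsup_{n \to \infty}\Bigl|\dfrac{L(n; g)}{\log n} - \dfrac{2}{\pi\log n}\int_1^{n^{1/2}} x^{-1}\Bigl|\int_0^1 \sin xt\,dg(t)\Bigr|\,dx - (2/\pi^2)\,|g(1) - g(1-)|\Bigr|$
$\le (2/\pi)\,\varphi(A) + (1/\pi)\,\varepsilon^2\,V(g).$

Letting $\varepsilon \to 0$ and $A \to \infty$ completes the proof of Lemma 1, since $\varphi(A) \to 0$
as $A \to \infty$.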

Our next lemma is a direct generalization (even to the proof) of the corre- 
sponding theorem for Fourier series due to Wiener (10, p. 221). It seems 
likely that it would be in the literature already, but, for lack of a reference, 
we include a proof. 


LEMMA 2.† If h(t) is continuous and of bounded variation, $0 \le t \le 1$, then

(20) $\int_1^n x^{-1}\Bigl|\int_0^1 \sin xt\,dh(t)\Bigr|\,dx = o(\log n).$


Proof. Without loss of generality, we assume that h(0) = h(1) = 0. Otherwise,
we could subtract from h(t) an appropriate linear function, and such
a function contributes to our integral only

$\int_1^n x^{-1}\Bigl|\int_0^1 \sin xt\,dt\Bigr|\,dx = O(1).$

*C is finite, since $\int_1^\infty \dfrac{|\sin t| - (2/\pi)}{t}\,dt$ converges.

†The converse also holds (as is the case in Wiener’s theorem); that is, for h(t) of bounded
variation, (20) implies that h(t) is continuous. This is an immediate consequence of (1) and
(4), once Lemma 1 is taken into account.




























THE LEBESGUE CONSTANTS 289 
This assumption made, integration by parts shows that

$\int_1^n x^{-1}\left|\int_0^1 \sin xt \, dh(t)\right| dx = \int_1^n \left|\int_0^1 (\cos xt)\, h(t)\, dt\right| dx.$


Defining h(t) to be zero for t > 1 and for t < 0, we have, from Parseval’s 
theorem, 


J 4[o +2) - aw} +... + [ie + ae =) Phe 


oo 1 2 
= {k, 2)} | 4 sin® {x/(2k)} | e*'h(t) dt| dx 
—x 0 
el 
0 


= 2k 2 j * 2k 
> pk | | e*"h(t) dt| dx > ry 
A k 
where » > 0, and the last inequality follows from that of Buniakowsky- 


Schwarz (4, p. 132). 


Hence, 


2k 1 
J J (cos st)h(t) dt dx 
“fale 8) a0 f+ ..-+ fave a 2=) a} 
<|» J Afals 2)-20 +...4+ h(t+1) — hlt+— (dt 


< pt2Ve(1/k)}, 














dx. 








| 








sl |" 2 
| e*"h(t) dt bax ‘ 








where $\omega$ is the modulus of continuity, and V the total variation, of h(t).
The lemma follows by using the above estimate for $k = 1, 2, 4, 8, \ldots, 2^m$,
where $m = [\log_2 n]$, the largest integer in $\log_2 n$, and adding, since $\omega(1/k) \to 0$
as $k \to \infty$.


3. Proof of Theorem 1. Now let g(t) = h(t) + j(t), where h(t) is continuous
and

$j(t) = \sum_{\xi_k \le t} [g(\xi_k+) - g(\xi_k-)],$

where the (possibly countably infinite) set $\{\xi_k\}$ consists of all the points of
discontinuity (jumps) of g(t).
By Lemma 1,

$L(n;g) = (2/\pi)\int_1^{n^{1/2}} x^{-1}\left|\int_0^1 \sin xt \, dj(t)\right| dx + o(\log n) + O(1)\int_1^{n^{1/2}} x^{-1}\left|\int_0^1 \sin xt \, dh(t)\right| dx + (2/\pi^2)|g(1) - g(1-)|\log n,$

and so, by Lemma 2,

(21) $L(n;g) = (2/\pi)\int_1^{n^{1/2}} x^{-1}\left|\int_0^1 \sin xt \, dj(t)\right| dx + (2/\pi^2)|g(1) - g(1-)|\log n + o(\log n).$


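Now, if

$F(T) = \int_0^T \left|\int_0^1 \sin xt \, dj(t)\right| dx,$

then

$F(T) = \int_0^T \left|\sum_k [g(\xi_k+) - g(\xi_k-)] \sin \xi_k x\right| dx + o(T) = T A(g) + o(T),$

say, while,

$\int_1^{n^{1/2}} x^{-1}\left|\int_0^1 \sin xt \, dj(t)\right| dx = \int_1^{n^{1/2}} T^{-1}F'(T)\,dT$

$= n^{-1/2}F(n^{1/2}) - F(1) + \int_1^{n^{1/2}} T^{-2}F(T)\,dT$

$= O(1) + A(g)\int_1^{n^{1/2}} T^{-1}\,dT + o(1)\int_1^{n^{1/2}} T^{-1}\,dT$

$= \tfrac12 A(g)\log n + o(\log n).$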

Finally, then, substituting in (21), we obtain the desired conclusions (1) 
and (2). 

The remaining conclusions (3), (4), (5), and (6) follow readily. (3) and
(4) are obvious consequences of (2), since g(t) is a function of bounded variation
with g(0) = 0 and g(1) = 1 and $M(|\sin x|) = 2/\pi$. The "if" part of (5)
is plain, since ordinary convergence is the Hausdorff method with g(t) = 0,
$0 \le t < 1$, g(1) = 1. For the "only if" part, we note that for C(g) to equal
$4/\pi^2$ it is necessary that each term on the right of (2) be $2/\pi^2$, so that
g(1) - g(1-) = 1, whence g(1-) = 0 (since g(1) = 1). The non-decreasing
character of g(t) then implies that g(t) = 0, $0 \le t < 1$, so that g(t) is the
weight function for ordinary convergence.
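
For ordinary convergence the constants in question are the classical Lebesgue constants of Fourier series, whose growth like $(4/\pi^2)\log n$ is a standard fact. The sketch below approximates them by a midpoint rule; the function name, the quadrature and the sample resolution are assumptions made for illustration and are not taken from the paper.

```python
import math

def lebesgue_constant(n, samples_per_unit=400):
    """Midpoint-rule approximation of the classical Lebesgue constant
    (1/pi) * integral_0^pi |sin((n + 1/2) t) / sin(t/2)| dt."""
    steps = samples_per_unit * (n + 1)
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints avoid the removable singularity at t = 0
        total += abs(math.sin((n + 0.5) * t) / math.sin(0.5 * t))
    return total * h / math.pi

def excess_over_log_term(n):
    """Difference between the computed constant and (4/pi^2) log n."""
    return lebesgue_constant(n) - (4.0 / math.pi ** 2) * math.log(n)
```

If the leading term is $(4/\pi^2)\log n$, then `excess_over_log_term(n)` should stay bounded as n grows, which is what one would expect to observe for moderate values of n.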

The first part of (6) is obvious from (2) and (3). The "if" portion of the
second part has been established by Livingston (7). To establish the "only
if" part we use the following lemma.


LEMMA 3. If $a_k \ge 0$ and if at least two of the $a_k$'s are positive, then

(22) $M\left|\sum_k a_k \sin \xi_k x\right| < (2/\pi)\sum_k a_k.$

Proof. We consider

$\sum_k a_k |\sin \xi_k x| - \left|\sum_k a_k \sin \xi_k x\right|.$

By hypothesis, this cannot be identically zero. It is, however, non-negative
and almost-periodic, and so has positive mean. Thus, taking means,

$\sum_k a_k (2/\pi) - M\left|\sum_k a_k \sin \xi_k x\right| > 0,$

and the lemma is verified.
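
Inequality (22) can be illustrated numerically by replacing the mean value M with an average over a long interval. This is a sketch only: the averaging window, the step, and the sample coefficients and frequencies are illustrative assumptions, and a finite average merely approximates the mean.

```python
import math

def mean_abs_sine_sum(coeffs, freqs, T=2000.0, step=0.01):
    """Approximate M|sum_k a_k sin(xi_k x)| by the average over [0, T]
    (midpoint rule with the given step)."""
    steps = int(T / step)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * step
        total += abs(sum(a * math.sin(xi * x) for a, xi in zip(coeffs, freqs)))
    return total / steps

# One positive coefficient: the mean should be close to (2/pi) * a_1.
single = mean_abs_sine_sum([1.0], [1.0])

# Two positive coefficients with incommensurable frequencies:
# the mean should fall strictly below (2/pi) * (a_1 + a_2).
pair = mean_abs_sine_sum([0.5, 0.5], [1.0, math.sqrt(2.0)])
```

Comparing `single` and `pair` with `2 / math.pi` shows the gap that Lemma 3 asserts once a second positive coefficient is present.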


Putting $a_k = g(\xi_k+) - g(\xi_k-)$ now shows, in view of the above lemma and
(2), that $g(\xi_k+) - g(\xi_k-)$ can be different from zero for only one value of k
if $C(g) = 2/\pi^2$, that is, that g(t) can have at most one discontinuity. From
(4) and the present assumption that g(1) = g(1-), we know that it must
have at least one jump in 0 < t < 1. Thus, g(t) is the one-jump step function
which defines a method of Euler type, and the proof of (6) is complete.


4. Proof of Theorem 2. We show now that the error term in (1), o(log n),
cannot be improved even for the class of increasing absolutely continuous
weight functions.

By Livingston's formula (11) we have

$L(n;g) \ge (2/\pi)\int_0^{\frac12\pi n^{-1/2}} x^{-1}\left|\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2} \sin 2nxt \, dg(t)\right| dx + o(1),$

which in turn yields

(23) $L(n;g) \ge (2/\pi)\int_0^{\pi n^{1/2}} x^{-1}\left|\int_0^1 \sin xt \, dg(t)\right| dx + O(1).$

This latter inequality follows from its predecessor since

$[1 - 4t(1-t)\sin^2 x]^{n/2} = 1 + O(nx^2)$

and

$\int_0^{\frac12\pi n^{-1/2}} x^{-1}(nx^2)\,dx = O(1).$

We may assume, of course, that $\epsilon(n)\log n \to \infty$, since otherwise the theorem
is trivial. This done, we proceed now with the construction of an $L_1$ function
q(t) and a $\delta > 0$ for which

(24) $\int_0^{\pi n^{1/2}} x^{-1}\left|\int_0^1 q(t) \sin xt \, dt\right| dx > \delta\,\epsilon(n)\log n.$

Now let $\{a_n\}$ be a convex sequence such that $a_n \to 0$ and $a_n \ge [\epsilon(n)]^{1/2}$ for
all n. Then (10, p. 109; 11, p. 183), $a_0 + \sum a_n \cos nt$ is the Fourier series of
some non-negative $L_1$ function, say p(t). Define $q(t) = p\{\pi(t - \tfrac12)\} - p\{\pi(t + \tfrac12)\}$.
Thus, q(t) is in $L_1$ and has the Fourier series

$2a_1 \sin \pi t - 2a_3 \sin 3\pi t + 2a_5 \sin 5\pi t - + \ldots.$


Let 














$M = \int_0^1 |q(t)|\,dt$


and recall that M > 2a, > 2[e(n)]! so that [e(m)]*/(2M) < x. 
Denote by J, the interval 


_ fe(n)}? 


(2k +1) OM 


<x < (2k+1) 24 


for 0 < 2k + 1 < n}. These intervals are disjoint and lie in (0, xn!). Through- 


out J, we have 





ne , — [e(n)}? 
isin xt — sin(2k + 1) at| < “OM? 
and so 
1 
if q(t) sin xt a 
0 
9 fe(n)? ¢° 
> Jaw sin {(2k + 1) zt} a ae | \g(t)| dt 


= |arsi| — 4{e(n)]? > 3fe(n)}?. 


Thus, J, having length }[e(n)]*/M, we obtain 


-1 
J 
Ik 





1 ! 
, a i 8 . 
Jaw sin xt dt dx > {Mn 2k+1° 


Hence 


xni/2 
—1 
x 
0 


In view of (23), this proves (24) with 6 = 1/(16Mz). 
Now, by the decomposition theorem, we can write 


dx > 4% I e(n) log n 


7 4M & 1cteen? 21? 16M 











el 
J q(t) sin xt dt 
0 


t 
f q(s) ds = cigi(t) — coge(t), 


where g;(¢), « = 1,2, are absolutely continuous, increasing and g:(0) = 


g:(1) = |, 


If gi(t) and g2(t) both satisfied the relation L(n; g) = o(e(m) log m), then, 


from (23), 





ani/2 
-1 
x 
0 


which contradicts (24). 





1 
f q(t) sin xt dt} dx = o(e(n) log n), 
0 


Thus, at least one of the pair g;(¢), g2(t) must meet our requirements and 


the proof is complete. 




5. Proof of Theorem 3. Here we revert to (11), which assumes now a
simpler form since the weight function (7) is absolutely continuous. Disregarding
the error term, which is o(1), we denote the present case of the
integral on the right of (11) by $L(n; C_\alpha)$, so that

(25) $L(n; C_\alpha) = (2\alpha/\pi)\int_0^{\frac12\pi} x^{-1}\left|\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2}(1-t)^{\alpha-1} \sin 2nxt \, dt\right| dx.$


The principle of stationary phase leads us to expect the chief contribution
of the inner integrand to arise when the expression in brackets is virtually
one. Accordingly, we set about replacing that expression by 1 and show
that the resulting error is o(1). To this end, we disregard the factor $(2\alpha/\pi)$
and decompose the integral as follows:

(26) $\int_0^{\frac12\pi} x^{-1}\left|\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2}(1-t)^{\alpha-1} \sin 2nxt \, dt\right| dx = \int_0^{n^{\beta-1}} + \int_{n^{\beta-1}}^{\frac12\pi},$

where $\beta$ is a constant, $\tfrac12 < \beta < 1$.
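We show first that the last integral is o(1).
Denoting the absolute value of the inner integral by $V_n(x)$, we have

(27) $\int_{n^{\beta-1}}^{\frac12\pi} x^{-1}V_n(x)\,dx \le \{\max V_n(x)\}(1 - \beta)\log n,$

where the maximum is taken for $n^{\beta-1} \le x \le \tfrac12\pi$.
Now, noting that $\sin x \ge (2/\pi)x$ for $0 \le x \le \tfrac12\pi$, and that here x is between
$n^{\beta-1}$ and $\tfrac12\pi$,

$[1 - 4t(1-t)\sin^2 x]^{n/2} \le [1 - 16\pi^{-2}t(1-t)x^2]^{n/2} \le [1 - 16\pi^{-2}t(1-t)n^{2\beta-2}]^{n/2} \le \exp\{-8\pi^{-2}t(1-t)n^{2\beta-1}\},$

since $(1 - k^{-1})^k$ increases to $e^{-1}$ as k becomes infinite.

Thus,

$0 \le V_n(x) \le \int_0^1 \exp\{-8\pi^{-2}t(1-t)n^{2\beta-1}\}(1-t)^{\alpha-1}\,dt$

$\le 2\int_0^{1/2} \exp\{-8\pi^{-2}t(1-t)n^{2\beta-1}\}\,t^{\alpha-1}\,dt$

$\le 2\int_0^{1/2} \exp\{-4\pi^{-2}t\,n^{2\beta-1}\}\,t^{\alpha-1}\,dt$

$< 2\int_0^{\infty} \exp\{-4\pi^{-2}t\,n^{2\beta-1}\}\,t^{\alpha-1}\,dt$

$= 2\Gamma(\alpha)(\tfrac14\pi^2)^{\alpha}(n^{2\beta-1})^{-\alpha}.$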














In connection with (27), this establishes that

(28) $\int_{n^{\beta-1}}^{\frac12\pi} x^{-1}V_n(x)\,dx \le \int_{n^{\beta-1}}^{\frac12\pi} x^{-1}\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2}(1-t)^{\alpha-1}\,dt\,dx = o(1).$

In the first integral on the right in (26) we replace the expression in brackets
by 1 and show that the error committed is o(1). To do so, we consider the
difference $D_n(C_\alpha)$ of the two expressions:


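(29) $D_n(C_\alpha) = \int_0^{n^{\beta-1}} x^{-1}\left|\int_0^1 f_n(t) \sin 2nxt \, dt\right| dx,$

where

(30) $f_n(t) = \{1 - [1 - 4t(1-t)\sin^2 x]^{n/2}\}(1-t)^{\alpha-1}.$

We note that $f_n(0) = f_n(1-) = 0$ and integrate by parts the inner integral
in (29), obtaining

$D_n(C_\alpha) = \frac{1}{2n}\int_0^{n^{\beta-1}} x^{-2}\left|\int_0^1 \{(1-\alpha)(1-t)^{-1}f_n(t) + (2n\sin^2 x)[1 - 4t(1-t)\sin^2 x]^{n/2-1}(1-t)^{\alpha-1}(1-2t)\}\cos 2nxt \, dt\right| dx$

$\le \frac{1}{2n}\int_0^{n^{\beta-1}} x^{-2}\int_0^1 (1-t)^{-1}f_n(t)\,dt\,dx + \int_0^{n^{\beta-1}} x^{-2}\sin^2 x\int_0^1 (1-t)^{\alpha-1}\,dt\,dx$

$= D_{n1}(C_\alpha) + D_{n2}(C_\alpha) = D_{n1}(C_\alpha) + o(1).$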
Now, (4, p. 40 (2.15.3))

(31) $0 \le 1 - [1 - 4t(1-t)\sin^2 x]^{n/2} \le 2nt(1-t)\sin^2 x,$

so that

(32) $D_{n1}(C_\alpha) \le \int_0^{n^{\beta-1}} x^{-2}\sin^2 x\int_0^1 (1-t)^{\alpha-1}\,dt\,dx = o(1).$
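
Inequality (31) is of Bernoulli type and can be spot-checked on a grid. The sketch below assumes $n \ge 2$, so that the exponent n/2 is at least 1; the function names and grid size are illustrative choices, not from the paper.

```python
import math

def sides_of_31(n, t, x):
    """Return (lhs, rhs) of (31):
    lhs = 1 - [1 - 4t(1-t) sin^2 x]^(n/2), rhs = 2 n t (1-t) sin^2 x."""
    s2 = math.sin(x) ** 2
    u = 4.0 * t * (1.0 - t) * s2
    return 1.0 - (1.0 - u) ** (n / 2.0), 2.0 * n * t * (1.0 - t) * s2

def check_31(n, grid=100):
    """True if 0 <= lhs <= rhs (up to rounding) at every grid point
    with t in [0, 1] and x in [0, pi/2]."""
    for i in range(grid + 1):
        for j in range(grid + 1):
            lhs, rhs = sides_of_31(n, i / grid, 0.5 * math.pi * j / grid)
            if lhs < -1e-12 or lhs > rhs + 1e-12:
                return False
    return True
```

For instance, `check_31(10)` scans the inequality for n = 10. A grid scan of this kind is a sanity check on the reconstructed formula rather than a proof.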


Thus, $D_n(C_\alpha) = o(1)$ and we have

(33) $L(n; C_\alpha) = (2\alpha/\pi)\int_0^{n^{\beta-1}} x^{-1}\left|\int_0^1 (1-t)^{\alpha-1} \sin 2nxt \, dt\right| dx + o(1)$

$= (2\alpha/\pi)\int_0^{2n^{\beta}} x^{-1}\left|\int_0^1 (1-t)^{\alpha-1} \sin xt \, dt\right| dx + o(1)$

$= (2\alpha/\pi)\int_0^{\infty} x^{-1}\left|\int_0^1 (1-t)^{\alpha-1} \sin xt \, dt\right| dx + o(1).$

This completes the proof of (9), provided the infinite integral exists. That
it does was shown in a few lines by Cramér (2, p. 10). Alternatively, this fact
can be established by the method employed in the next section to prove the
convergence of the integral in (10). The remaining parts of the theorem are
either incorporated in (10'), proved in § 6, or are obvious.


Remark. Cramér based his proof of Theorem 3 on the equivalence of the 
Cesaro and Riesz means, rather than, as here, by regarding the (C, a) means 
as special Hausdorff methods. 


6. Proof of Theorem 4. The proof of (10) follows the same lines as the 
proof of (9), and, in fact, utilizes some of the same calculations. 

In analogy with the previous section, we define $L(n; H_\alpha)$ to be the integral
on the right of (11), thereby committing an error of o(1), with g(t) now
given by (8).

Thus,

(34) $L(n; H_\alpha) = \frac{2}{\pi\Gamma(\alpha)}\int_0^{\frac12\pi} x^{-1}\left|\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2}(-\log t)^{\alpha-1} \sin 2nxt \, dt\right| dx.$


As before, we consider first that portion of the integral from $x = n^{\beta-1}$ to
$x = \tfrac12\pi$, with $\tfrac12 < \beta < 1$.

Since $0 < \alpha < 1$, we have $(-\log t)^{\alpha-1} \le (1-t)^{\alpha-1}$, 0 < t < 1, so that
the portion under consideration is less than

$\frac{2}{\pi\Gamma(\alpha)}\int_{n^{\beta-1}}^{\frac12\pi} x^{-1}\int_0^1 [1 - 4t(1-t)\sin^2 x]^{n/2}(1-t)^{\alpha-1}\,dt\,dx,$

which, from (28), is o(1).


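Continuing, we define

$D_n(H_\alpha) = \int_0^{n^{\beta-1}} x^{-1}\left|\int_0^1 \{1 - [1 - 4t(1-t)\sin^2 x]^{n/2}\}(-\log t)^{\alpha-1} \sin 2nxt \, dt\right| dx.$

Integrating the inner integral by parts, this becomes

$D_n(H_\alpha) = \frac{1}{2n}\int_0^{n^{\beta-1}} x^{-2}\left|\int_0^1 \{(1-\alpha)(t^{-1})(-\log t)^{\alpha-2}\{1 - [1 - 4t(1-t)\sin^2 x]^{n/2}\} + (2n\sin^2 x)(-\log t)^{\alpha-1}[1 - 4t(1-t)\sin^2 x]^{n/2-1}(1-2t)\}\cos 2nxt \, dt\right| dx$

$\le \frac{1}{2n}\int_0^{n^{\beta-1}} x^{-2}\int_0^1 (1-t)^{\alpha-2}\{1 - [1 - 4t(1-t)\sin^2 x]^{n/2}\}\,dt\,dx + \int_0^{n^{\beta-1}} x^{-2}\sin^2 x\int_0^1 (1-t)^{\alpha-1}\,dt\,dx.$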


The last term is $D_{n2}(C_\alpha)$ which has been shown to be o(1). Applying (31)
to the preceding term shows it to be less than the integral in (32), which
is also o(1). Hence $D_n(H_\alpha) = o(1)$ and

(35) $L(n; H_\alpha) = (2/\pi)[1/\Gamma(\alpha)]\int_0^{n^{\beta-1}} x^{-1}\left|\int_0^1 (-\log t)^{\alpha-1} \sin 2nxt \, dt\right| dx + o(1)$

$= (2/\pi)[1/\Gamma(\alpha)]\int_0^{\infty} x^{-1}\left|\int_0^1 (-\log t)^{\alpha-1} \sin xt \, dt\right| dx + o(1),$

provided the infinite integral converges. If so, this completes the proof of (10).


That the integral converges is a consequence of Bromwich's Theorem, once
we observe that there is no singularity at x = 0 in (35). We use that form
of Bromwich's Theorem employed in (9, p. 230). For convenience we paraphrase
its statement:

BROMWICH'S THEOREM. Let f(t) be of bounded variation for $t \ge 0$. Then, for
$0 < \alpha < 1$,

(36) $x^{\alpha}\int_0^1 t^{\alpha-1}f(t)\theta(xt)\,dt = f(0+)\Gamma(\alpha)\theta(\tfrac12\alpha\pi) + o(1),$

where $\theta(t)$ denotes either of the functions cos t or sin t.


Applying this to the inner integral in (35) yields

$x^{\alpha}\int_0^1 (-\log t)^{\alpha-1} \sin xt \, dt = x^{\alpha}\int_0^1 t^{\alpha-1}f_\alpha(t) \sin\{x(1-t)\}\,dt$

$= x^{\alpha}\sin x\int_0^1 t^{\alpha-1}f_\alpha(t)\cos xt \, dt - x^{\alpha}\cos x\int_0^1 t^{\alpha-1}f_\alpha(t)\sin xt \, dt$

$= (\sin x)\{\Gamma(\alpha)f_\alpha(0+)\cos\tfrac12\alpha\pi + o(1)\} - (\cos x)\{\Gamma(\alpha)f_\alpha(0+)\sin\tfrac12\alpha\pi + o(1)\}$

$= \Gamma(\alpha)f_\alpha(0+)\sin(x - \tfrac12\alpha\pi) + o(1) = \Gamma(\alpha)\sin(x - \tfrac12\alpha\pi) + o(1),$

where $f_\alpha(t) = [-t^{-1}\log(1-t)]^{\alpha-1}$, so that $f_\alpha(0+) = 1$.
Thus, the inner integral is $O(x^{-\alpha})$, making the integrand of the infinite
integral $O(x^{-\alpha-1})$, establishing its convergence (in view of the regularity at
x = 0, already pointed out).


Remark 1. An even easier application of Bromwich's Theorem establishes

(37) $x^{\alpha}\int_0^1 (1-t)^{\alpha-1} \sin xt \, dt = \Gamma(\alpha)\sin(x - \tfrac12\alpha\pi) + o(1).$

This result demonstrates the convergence of the integral in (9).
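
The asymptotic relation (37) can be examined numerically for moderately large x. The sketch below is an illustration under stated assumptions: the substitution $1 - t = u^{1/\alpha}$ (chosen here only to remove the endpoint singularity), the midpoint rule, and the function names are not taken from the paper.

```python
import math

def scaled_integral(alpha, x, steps=200000):
    """x^alpha * integral_0^1 (1-t)^(alpha-1) sin(xt) dt.
    With 1 - t = u^(1/alpha) the integral becomes
    (1/alpha) * integral_0^1 sin(x (1 - u^(1/alpha))) du,
    which has a bounded integrand and is handled by a midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.sin(x * (1.0 - u ** (1.0 / alpha)))
    return (x ** alpha) * total * h / alpha

def main_term(alpha, x):
    """Gamma(alpha) * sin(x - alpha*pi/2), the leading term in (37)."""
    return math.gamma(alpha) * math.sin(x - 0.5 * alpha * math.pi)
```

For $0 < \alpha < 1$ and large x, `scaled_integral(alpha, x)` and `main_term(alpha, x)` should differ by a quantity that shrinks as x grows, in line with the o(1) term.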
It also shows that the requirement of Bromwich's Theorem that f(t) be of
bounded variation cannot be relaxed to the slightly weaker assumption that f(t)
be monotonic, positive, and integrable. For, could this be done, we would have

$x^{\alpha}\int_0^1 (1-t)^{\alpha-1} \sin xt \, dt = x^{\alpha}\int_0^1 t^{\alpha-1}(t^{-1} - 1)^{\alpha-1} \sin xt \, dt = \Gamma(\alpha)(\sin\tfrac12\alpha\pi)f(0+) + o(1) = o(1),$

since $f(t) = (t^{-1} - 1)^{\alpha-1}$ and $0 < \alpha < 1$, and this contradicts (37).
The same point can be made by considering similarly

$\int_0^1 (-\log t)^{\alpha-1} \sin xt \, dt.$
0 


Remark 2. Information similar to Bromwich’s Theorem is collected (11, 
chapter v, § 2, and p. 379), where references to further literature are found. 


Reverting now to the proof of Theorem 4, we require a lemma in order to 
establish (10’) and (10’’): 


LEMMA 4. Given regular Hausdorff methods $T_1$ and $T_2$ with associated Lebesgue
constants $L_1(n)$ and $L_2(n)$, respectively. Suppose that there is a "totally regular"
Hausdorff method U such that $T_2 = UT_1$. Then, if $\lim L_i(n)$ exists $(n \to \infty)$
and equals $L_i$, i = 1, 2, we have $L_2 \le L_1$.


Proof. The matrix of U has exclusively non-negative elements, as shown
(together with the converse) by Hurwitz (6, p. 243), so that

$D_n^{(2)}(t) = \sum_{m=0}^{n} \gamma_{nm}D_m^{(1)}(t), \quad \gamma_{nm} \ge 0,$

where $D_n^{(i)}(t)$ denotes the $T_i$ transform of the Dirichlet kernel, i = 1, 2.
Hence

$|D_n^{(2)}(t)| \le \sum_m \gamma_{nm}|D_m^{(1)}(t)|,$

and, integrating,

$L_2(n) \le \sum_m \gamma_{nm}L_1(m).$

The conclusion now follows from the regularity of the method U and the
existence of the limits $L_1$ and $L_2$.

The lemma established, the proofs of (10') and (10'') are immediate. For
(10'), we identify $T_1$ with $(H, \alpha)$ and $T_2$ with $(C, \alpha)$, and note that Basu
(1, pp. 453-454) has shown that the corresponding U is totally regular.

As to (10''): first let $T_1$ be $(C, \alpha)$ and $T_2$ be $(C, \beta)$. The corresponding U
is totally regular (6, p. 245) and so $L(C_\beta) \le L(C_\alpha)$ for $0 < \alpha < \beta$. Next, let
$T_1$ be $(H, \alpha)$ and $T_2$ be $(H, \beta)$. Again, the corresponding U is totally regular
(1, pp. 459-460) so that $L(H_\beta) \le L(H_\alpha)$. Finally, we note that $L(C_\alpha)$ and
$L(H_\alpha)$ each exceed the integrals obtained, respectively, by deleting the
absolute value signs about the respective inner integrals, and that the resulting
values are both 1.










The remaining assertions of Theorem 3 are obvious. 


Remark, That the Gibbs phenomenon is found for (H,a) for at least as 
large a as for (C, a) also follows from the positivity of the matrix (H, a)/(C, a), 
since this implies that the oscillation of the (C, a) means cannot exceed that 
of the (H, a) means (3, p. 52, Theorem 9). 


REFERENCES 


1. S. K. Basu, On the total relative strength of the Hölder and Cesàro methods, Proc. London
Math. Soc. (2), 50 (1949), 447-462.
2. Harald Cramér, Études sur la sommation des séries de Fourier, Ark. f. Mat., Astr., och
Fys., 13 (1918) n:o 20, 1-21.
3. G. H. Hardy, Divergent series (Oxford, 1949).
4. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (Cambridge, 1934).
5. Einar Hille and J. D. Tamarkin, On the summability of Fourier series. III, Math. Ann.,
108 (1933), 525-577.
6. W. A. Hurwitz, Some properties of methods of evaluation of divergent sequences, Proc. London
Math. Soc. (2), 26 (1927), 231-248.
7. Arthur E. Livingston, The Lebesgue constants for Euler (E,p) summation of Fourier series,
Duke Math. J., 21 (1954), 309-314.
8. Otto Szász, Gibbs' phenomenon for Hausdorff means, Trans. Amer. Math. Soc., 69 (1950),
440-456.
9. G. N. Watson, A treatise on the theory of Bessel functions (2nd ed.; Cambridge, 1944).
10. A. Zygmund, Trigonometrical series (Warsaw, 1935).
11. A. Zygmund, Trigonometric series (2nd ed.; Cambridge, 1959), I.







University of Alberta 
and 
Yeshiva University 
















SOME PROBLEMS FOR TYPICALLY REAL FUNCTIONS 
JAMES A. JENKINS 


1. Many extremal properties of the class of normalized univalent functions
are shared by the class of typically real functions each considered in the unit
circle. By the class T of typically real functions we mean those functions f(z),
regular for |z| < 1 with f(0) = 0, f'(0) = 1, and such that $\Im f(z) > 0$ for
$\Im z > 0$, $\Im f(z) < 0$ for $\Im z < 0$. This class was first studied by Rogosinski (4)
who proved various simple properties for it. Later Robertson (3) took up
the study and proved the following important representation result. If $f(z) \in T$
there exists a function $\alpha(\theta)$, increasing for $0 \le \theta \le \pi$ with $\alpha(0) = 0$, $\alpha(\pi) = 1$
such that

$f(z) = \int_0^{\pi} z(1 - 2\cos\theta\, z + z^2)^{-1}\,d\alpha(\theta).$

Other authors have treated further problems for the class T but none of
these problems seems to belong to the class we would characterize as conditional
extremal problems. For the class S of normalized univalent functions
in the unit circle the best known such problem is of course the problem of
Gronwall (1) to determine the maxima and the minima of |f(z)| and |f'(z)|
for |z| = r when the value of |f''(0)| is assigned. In the present paper we
will solve analogous problems for the class T. For the minimum problems an
essential role in the solution is played by the Neyman-Pearson Lemma,
important in statistical theory. For the maximum problems a different but
similar lemma is applied. The author wishes to express his thanks to Ky Fan
and Irving Glicksberg for calling his attention to the Neyman-Pearson
Lemma and to Seymour Sherman for a useful conversation on the relationship
between the mathematical and statistical aspects of the lemma.


2. Because of the difficulty of giving a reference to the Neyman—Pearson 
Lemma in a simple purely mathematical form we will give here a simple proof 
in the special case which is sufficient for our purposes. 


LEMMA 1. Let g(x), h(x) be non-negative measurable functions defined on the
interval [a, b] such that g(x)/h(x) is continuous and strictly increasing. Let

$0 < k < \int_a^b g(x)\,dx.$

Then the extremal problem given by


Received March 21, 1960. Research supported in part by the National Science Foundation. 










300 JAMES A. JENKINS 


$\int_a^b f(x)g(x)\,dx = k,$

$\int_a^b f(x)h(x)\,dx = \text{minimum}$

for functions f(x) measurable on [a, b] and satisfying $0 \le f(x) \le 1$ has as a
solution the function f*(x) defined by

(1) $f^*(x) = 0, \quad a \le x < c; \qquad f^*(x) = 1, \quad c \le x \le b,$

where c is determined by

(2) $\int_c^b g(x)\,dx = k.$

If f(x) is further required to be increasing this solution is unique apart from its
definition at the value x = c.


It is clear that the value c is uniquely determined by (2). Then let f*(x) be
the function defined by (1) and f(x) any other admissible function for the
extremal problem. Then if $g(c)/h(c) = \lambda$ we have

$\int_a^b (f - f^*)(\lambda h - g)\,dx \ge 0$

since on [a, c)

$f(x) - f^*(x) \ge 0, \quad \lambda h(x) - g(x) \ge 0$

and on [c, b]

$f(x) - f^*(x) \le 0, \quad \lambda h(x) - g(x) \le 0.$

Since

$\int_a^b fg\,dx = \int_a^b f^*g\,dx = k$

and $\lambda > 0$, unless c = a in which case the result is evident, we find

$\int_a^b fh\,dx \ge \int_a^b f^*h\,dx$

as stated. Evidently the solution is already determined up to a set of measure
zero. When f(x) is required to be increasing the solution of the extremal
problem is thus uniquely determined apart from its value at the point x = c.
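
A discretized version of Lemma 1 may make the argument concrete. The sketch below replaces the integrals by sums over cells of [0, π] and uses the weights g = sin θ, h = sin θ (1 - 2r cos θ + r²)^{-2} that appear in the application to Theorem 1. All function names, the cell count, and the fractional boundary cell (needed so that the discrete constraint is met exactly) are assumptions of this illustration, not part of the paper.

```python
import math
import random

def jenkins_weights(r, cells=200):
    """Midpoint samples of g = sin(theta) and
    h = sin(theta) * (1 - 2 r cos(theta) + r^2)^(-2) on (0, pi)."""
    thetas = [(i + 0.5) * math.pi / cells for i in range(cells)]
    g = [math.sin(th) for th in thetas]
    h = [math.sin(th) * (1.0 - 2.0 * r * math.cos(th) + r * r) ** -2 for th in thetas]
    return g, h

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def np_minimizer(g, k):
    """Discrete analogue of f* in (1): fill cells from the right end, where
    g/h is largest, until sum(f*g) reaches k; the boundary cell may be
    fractional. Assumes g > 0 and 0 < k < sum(g)."""
    f = [0.0] * len(g)
    remaining = k
    for i in reversed(range(len(g))):
        take = min(1.0, remaining / g[i])
        f[i] = take
        remaining -= take * g[i]
        if remaining <= 0.0:
            break
    return f

def random_admissible(g, k, rng):
    """A random f with 0 <= f <= 1 and sum(f*g) = k, obtained by rescaling
    uniform samples; returns None if the rescaled values exceed 1."""
    f0 = [rng.random() for _ in g]
    scale = k / dot(f0, g)
    f = [scale * a for a in f0]
    return f if max(f) <= 1.0 else None
```

With `g, h = jenkins_weights(0.5)` and `k = 0.3 * sum(g)`, one can compare `dot(f, h)` for random admissible `f` against `dot(np_minimizer(g, k), h)`; the latter should never be larger.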


THEOREM 1. Let $f(z) \in T$ and have the expansion about the origin
$f(z) = z + A_2z^2 + \ldots.$


It is well known that 
















TYPICALLY REAL FUNCTIONS 301 


For a fixed value $\mu$, $-2 < \mu < 2$, if

$A_2 = \mu$

then for 0 < r < 1

(3) $f(r) \ge r(1 - \mu r + r^2)^{-1},$
(4) $f'(r) \ge (1 - r^2)(1 - \mu r + r^2)^{-2}.$

Equality is attained only for the function $z(1 - \mu z + z^2)^{-1}$.


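Using for f(z) the representation

$f(z) = \int_0^{\pi} z(1 - 2\cos\theta\, z + z^2)^{-1}\,d\alpha(\theta)$

where $\alpha(\theta)$ is increasing for $0 \le \theta \le \pi$ with $\alpha(0) = 0$, $\alpha(\pi) = 1$ we have

(5) $A_2 = \int_0^{\pi} 2\cos\theta\,d\alpha(\theta),$

(6) $f(r) = \int_0^{\pi} r(1 - 2\cos\theta\, r + r^2)^{-1}\,d\alpha(\theta),$

(7) $f'(r) = \int_0^{\pi} (1 - r^2)(1 - 2\cos\theta\, r + r^2)^{-2}\,d\alpha(\theta).$

Integrating each of these equalities by parts (compare (5; Theorem 4b)) we
have

(8) $A_2 = 2[-\alpha(\pi) - \alpha(0)] + 2\int_0^{\pi} \alpha(\theta)\sin\theta\,d\theta = -2 + 2\int_0^{\pi} \alpha(\theta)\sin\theta\,d\theta,$

(9) $f(r) = r(1 + 2r + r^2)^{-1} + \int_0^{\pi} 2r^2\alpha(\theta)\sin\theta\,(1 - 2r\cos\theta + r^2)^{-2}\,d\theta,$

(10) $f'(r) = (1 - r^2)(1 + 2r + r^2)^{-2} + \int_0^{\pi} 4r(1 - r^2)\alpha(\theta)\sin\theta\,(1 - 2r\cos\theta + r^2)^{-3}\,d\theta.$

We now apply Lemma 1 with a, b, g(x), h(x), k replaced by 0, $\pi$, $\sin\theta$,
$\sin\theta(1 - 2r\cos\theta + r^2)^{-2}$, $\tfrac12(\mu + 2)$. Thus we find f(r) will be minimal when
$\alpha(\theta)$ is defined by

$\alpha(\theta) = 0, \quad 0 \le \theta < \theta^*; \qquad \alpha(\theta) = 1, \quad \theta^* \le \theta \le \pi,$

and $\theta^*$ is such that

$2\cos\theta^* = \mu.$

From this the inequality (3) is immediate. The equality statement follows at
once from Lemma 1. Alternatively replacing h(x) by $\sin\theta(1 - 2r\cos\theta + r^2)^{-3}$
we obtain the inequality (4) and the corresponding equality statement.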










3. If we consider instead of the minimum problem of Theorem 1 the corre- 
sponding maximum problem the Neyman—Pearson Lemma no longer provides 
a solution since the result corresponding to Lemma 1 would lead to a decreasing 
rather than an increasing function a(@). However we can use instead the fol- 


lowing lemma which is more special and essentially confined to the present 
situation. 


LEMMA 2. Let g(x), h(x) be non-negative measurable functions defined on the
interval [a, b] such that g(x)/h(x) is continuous and strictly increasing. Let

$0 < k < \int_a^b g(x)\,dx.$

Then the extremal problem given by

$\int_a^b f(x)g(x)\,dx = k,$

$\int_a^b f(x)h(x)\,dx = \text{maximum}$

for functions f(x) increasing on [a, b] and satisfying $0 \le f(x) \le 1$ has as a
solution the function f*(x) defined by

$f^*(x) = k \Big/ \int_a^b g(x)\,dx, \quad a < x < b.$

This solution is uniquely determined apart from its values at a and b.


Let f(x) be any admissible function for the extremal problem. There will
exist a value c, $a \le c \le b$ such that

$f(x) \le k \Big/ \int_a^b g(x)\,dx, \quad a \le x < c,$

$f(x) \ge k \Big/ \int_a^b g(x)\,dx, \quad c < x \le b.$

Then if $g(c)/h(c) = \lambda$ we have

$\int_a^b (f - f^*)(\lambda h - g)\,dx \le 0,$

since on [a, c)

$f(x) - f^*(x) \le 0, \quad \lambda h(x) - g(x) \ge 0$

and on (c, b]

$f(x) - f^*(x) \ge 0, \quad \lambda h(x) - g(x) \le 0.$

Since

$\int_a^b fg\,dx = \int_a^b f^*g\,dx = k$

and $\lambda > 0$, unless c = a in which case the result is evident, we find

$\int_a^b fh\,dx \le \int_a^b f^*h\,dx$

as stated. Since the solution is uniquely determined up to a set of measure
zero the equality statement is immediate from the requirement that f(x) be
increasing.


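THEOREM 2. Let $f(z) \in T$ and have the expansion about the origin
$f(z) = z + A_2z^2 + \ldots.$

For a fixed value $\mu$, $-2 < \mu < 2$, if

$A_2 = \mu$

then for 0 < r < 1

(11) $f(r) \le (\tfrac14\mu + \tfrac12)r(1 - r)^{-2} + (\tfrac12 - \tfrac14\mu)r(1 + r)^{-2},$

(12) $f'(r) \le (\tfrac14\mu + \tfrac12)(1 + r)(1 - r)^{-3} + (\tfrac12 - \tfrac14\mu)(1 - r)(1 + r)^{-3}.$

Equality is attained only for the function
$(\tfrac14\mu + \tfrac12)z(1 - z)^{-2} + (\tfrac12 - \tfrac14\mu)z(1 + z)^{-2}.$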


As in the proof of Theorem 1, $A_2$, f(r), f'(r) are represented by the equations
(5), (6), (7) or by the partially integrated expressions (8), (9), (10). We apply
this time Lemma 2 first with a, b, g(x), h(x), k replaced by 0, $\pi$, $\sin\theta$,
$\sin\theta(1 - 2r\cos\theta + r^2)^{-2}$, $\tfrac12(\mu + 2)$. Then we find that f(r) is maximal for
$\alpha(\theta)$ defined by

$\alpha(\theta) = \tfrac14\mu + \tfrac12, \quad 0 < \theta < \pi.$

By the normalization imposed on $\alpha(\theta)$ we have $\alpha(0) = 0$, $\alpha(\pi) = 1$. Inserting
this function in (6) we obtain the bound (11) and in the Stieltjes integral
representation of f(z) we find that the unique extremal function is

$(\tfrac14\mu + \tfrac12)z(1 - z)^{-2} + (\tfrac12 - \tfrac14\mu)z(1 + z)^{-2}.$

Alternatively replacing h(x) by $\sin\theta(1 - 2r\cos\theta + r^2)^{-3}$ we obtain the bound
(12) and find that the unique extremal function is again the one just given.
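
The bounds (3) and (11) can be tried out on functions of the class T built from discrete measures in the representation (6). The sketch below is an illustration under assumptions: the random atomic measures, the function names, and the tolerance are choices made here and do not come from the paper.

```python
import math
import random

def kernel(r, theta):
    """r / (1 - 2 r cos(theta) + r^2), the integrand of (6)."""
    return r / (1.0 - 2.0 * r * math.cos(theta) + r * r)

def sample_measure(rng, atoms=5):
    """A random discrete probability measure on [0, pi]: (weights, thetas)."""
    raw = [rng.random() for _ in range(atoms)]
    total = sum(raw)
    weights = [w / total for w in raw]
    thetas = [rng.uniform(0.0, math.pi) for _ in range(atoms)]
    return weights, thetas

def f_and_mu(r, weights, thetas):
    """f(r) from (6) and the second coefficient A_2 = mu from (5)."""
    f_r = sum(w * kernel(r, th) for w, th in zip(weights, thetas))
    mu = sum(w * 2.0 * math.cos(th) for w, th in zip(weights, thetas))
    return f_r, mu

def lower_bound(r, mu):
    """Right-hand side of (3)."""
    return r / (1.0 - mu * r + r * r)

def upper_bound(r, mu):
    """Right-hand side of (11)."""
    return ((0.25 * mu + 0.5) * r / (1.0 - r) ** 2
            + (0.5 - 0.25 * mu) * r / (1.0 + r) ** 2)
```

For any sampled measure and any 0 < r < 1, the value `f_r` returned by `f_and_mu` should lie between `lower_bound(r, mu)` and `upper_bound(r, mu)`.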

Numerous other problems can be solved by the methods presented here. 
Moreover, Lemma 2 has extensions which allow further applications. In the 
present cases it is interesting to note that the lower bounds given by (3) and 
(4) coincide with those obtained in Gronwall’s problem while the upper bounds 
(11) and (12) occur for a function which is not univalent, compare (2). 










REFERENCES 


1. T. H. Gronwall, On the distortion in conformal mapping when the second coefficient in the
mapping function has an assigned value, Proc. Nat. Acad. Sci. U.S.A., 6 (1920), 300-302.
2. James A. Jenkins, On a problem of Gronwall, Ann. Math., 59 (1954), 490-504.
3. M. S. Robertson, On the coefficients of a typically-real function, Bull. Amer. Math. Soc., 41
(1935), 565-572.
4. W. Rogosinski, Ueber positive harmonische Entwicklungen und typisch-reelle Potenzreihen,
Math. Zeit., 35 (1932), 93-121.
5. D. V. Widder, The Laplace transform (Princeton University Press, 1946).




Washington University 
St. Louis 













PROPERTIES OF THE COEFFICIENTS OF 
ORTHONORMAL SEQUENCES 


P. S. BULLEN 


1. Introduction. In this paper we consider complete orthonormal sequences
$\{\varphi_n\}$ defined on the interval [0, 1] and satisfying an inequality of the type

(1) $\left(\int_0^1 |\varphi_n|^{\nu}\,dx\right)^{1/\nu} \le F_n, \quad 2 < \nu < \infty; \qquad \sup_{0 \le x \le 1} |\varphi_n| \le F_n, \quad \nu = \infty,$

for all n and some sequence $\{F_n\}$. Such sequences were first considered by
Zygmund and Marcinkiewicz (8). They extended the well-known results of
Hausdorff-Young and Paley, originally proved for the case $\nu = \infty$, $F_n = M$ for
all n (12). We will consider cases of equality in the Hausdorff-Young theorems
and certain limiting cases of the Paley theorems. Application of these results
and the results in (8) will be made to functions harmonic in the unit a-sphere.


2. If p > 1 then p’ will denote the conjugate index, 1/p + 1/p’ = 1. 
If cy, 2,..., are the Fourier coefficients of a function in L,, with respect 
to {¢,} satisfying (1) define 


(2) iienahrrr'r™., s = 1,3,..., 


« \ ir [r # : \ ir 
(3) U,(c) = U,(d) = >> ld,(s)\"¢ = >> lel” Fe? 2 . 


n=1 
' — 
= max |d,(v’)| = max (|q.|F, ),7 = @, 
n n 

where 7, s are related by 

’ 9) —_ ’ 

v 2-y 
(4) i ee oe =1. 

s r 


The Fourier coefficients are replaced in this general situation by the sequence 
{d,}. For instance, the following extension of Mercer’s theorem can be proved 
along the lines of the original theorem (6, p. 155). 


TuHeoreM 1. If f € Ly then d,(v’) = o(1). 


Received December 23, 1959. 


305 











306 P. S. BULLEN 


3. Cases of equality. 


3.1. The cases of equality in the Hausdorff-Young theorems were first
discussed by Hardy and Littlewood (4) for the trigonometric case. Their results
were extended to the case $\nu = \infty$, $F_n = M$ for all n, by Verblunsky (10) and
Calderón and Zygmund (1). We will use the methods of the last authors to
prove the general Hausdorff-Young result and then to discuss equality.


3.2. THEOREM 2. (a) If f € Ly, »' < p < 2, with Fourier coefficients cy, c2,... 
with respect to {¢,} then 
(5) U,(d) < J,(f) 
where 


(b) If for a sequence {c,}, U,(d) < ~,1 < p < 2, then there exists a function 
f € L, such that c, is the Fourier coefficient of f with respect to d,,n = 1,2,..., 
and 


(6) J,(f) < U,(d), 
where 
vy  2—y' 
—-_-—— = ] 
q p 


It is known that (a) implies (b) by a conjugacy argument and that it is 
sufficient to prove (a) under the assumption that {¢,} has N terms, f is a 
simple function with J,(f) = 1, (1). 

Let {a,} be a sequence such that 


N 
@(d) = +m Cun Fa < 
Define {A,} and F(t) by 
a, = A,!*Ft”",,, i, >0, len] = 1, 
f®=F"O)n®, FeH>0, || =1. 
Putting 1/p = zin U,(d) it becomes 
N / - . es ; : { 1 ) 
(7) ®(z) = > Ao v’ (1—2)) /(2—9 Fe /(2-r ates f F*ngadt ¢ 2 
n=l 0 j 


a function continuous and bounded in every strip, x; < x < xo, of finite width. 
It is not difficult to show that 


N sl 
(8) F,>1, >> A, = 1, j F(t)dt = 1. 
n=l 0 


Hence, by simple applications of Hélder’s inequality and Bessel’s inequality 
it can be shown that neither |@(1/»’ + zy)|, nor |@(1/2 + iy)| exceeds 1. This, 













COEFFICIENTS OF ORTHONORMAL SEQUENCES 307 


by the Phragmén-Lindeléf theorem, implies |@(z)| < 1 in the whole strip 
1/2 < x < 1/»’, which proves the theorem. 


3.3. We are now in a position to discuss cases of equality in Theorem 2, 
excluding the trivial cases f = 0, p.p., and c, = 0 for all n. 

We can deduce (7) with no restrictions except J,(f) = 1 and again (z) 
is continuous and bounded, 4 < x < 1/»’, and regular, 4 < x < 1/»’, and (8) 
holds (with N = © of course.) 


THEOREM 3. (a) A necessary condition for equality in (5) is that 
N 


(9) f(x) = ya Can Gne(X), Nm, <mM2<... < My. 


For such functions we have equality if and only if 
. 1. im8 . 
(i) len, | Fa = 2, 
independenf of k, 


(ii) f is constant in a set of measure 


N -1 
(x F:,) and f=0Oin@E. 


(b) A necessary condition for equality in (6) is that only a finite number of 
C, differ from zero, and satisfy (i). The function is then of form (9) and we have 
equality if and only if it satisfies (ii). 


A conjugacy argument shows that (a) implies (b). Let us assume then that 
(1/p) = 1, that is, that we have equality in (5). Then the Phragmén-Lin- 
deléf theorem implies that @(z) = 1 for 4 < x < 1/»’. In particular 6(1/»’) = 
1. Further, (8) implies that 

; , 


| el : | 
Fees [FY nddty | <1. 


Hence for all 2 for which A, ¥ 0, 


’ 1 
(10) Frey f F" 9a,at = 1, 
0 


But F'’” € Ly, and so by Theorem 1 the left-hand side of (10) is 0(1). There- 
fore there is at most a finite number of non-zero A,, which proves (9). 
From (10) we also get that 


1 
J. ii lons| (sign f) (sign Gundu)dt = Fy, > 0, 
which implies two important facts about the set E where f is non-zero. 


(a) sign f = sign (Endn), p.p.inE,k =1,...,N. 
(b) F(x) = {Fnilda,(x)|}" p.p. in E,k = 1,2,...,N. 











Hence, 
lmal = J U/l lal dt = Fae f ipl FO at, 
E gE 
which proves (i). Also 
N 
If(x)| = F’’"’ (x) > nal Fs, 


Let r be any number, »’ < r < 2, then the initial remark of the proof 
implies 


Jf), 


N 7. —Il/s . l/r 
(x F:,) = (| Fa) , 
k=1 EF 


Applying this equality for r = v’, 2, and p, where p is any value between 2 
and «, to the Hélder inequality 


» . (p—v’)/(2—r’) . : (2—p) /(2—r’) 
f Pra < (| Fa) (| F’ rat) 
E E E 


we see that it reduces to equality. Hence F, and so f, is constant p.p. in E. 
This proves the necessity, the sufficiency is immediate. 




4. Star theorems. Given a sequence $\{c_n\}$ such that $c_n = o(1)$ the sequence
$\{c^*_n\}$ denotes $\{|c_n|\}$ arranged in descending order.
The proof (8) of the extension of Paley’s theorems requires F, to satisfy 


(11) Fic ki < Fa <... 


or, at least, that for some a > 1 and all i, j,i < j, 


(11)’ max F,<K min /f,. 
at+l1<at*! ai+1<gai+! 
Whether this is essential is not known. If /, = M for all m the order of the 


sequence ¢, is immaterial and the Paley theorems can be improved to the 
Paley star theorems (8). However, because of (11) (or (11)’), no such simple 
argument is possible in general. We conjecture the following star theorem. 
It would follow immediately from the unstarred result if (11)’ could be dropped. 
Let, d = d, and define 


(12) V,(d) = V, 


« yur 
>> \d,(r)|" 0 waeinas .y>2.1 <v’<creve @ 


n=l 


max {|d,(@)|n},r =v = o. 
mn 






















THEOREM 4. (a) Let d, = 0(1) be such that, 2 < q < v, V,(d*) < @. 
Then there exists an f © L, such that 


wv /(2—9")((2" /¢)—1) 
= dF, 


(13) Jaf) < AgeVa(d"). 


(b) If f € Ly, v <p < 2, has Fourier coefficients c, co... , with respect to 
on} then 


(14) V(8") < 4,..J,(f). 


These theorems were first mentioned in a paper by Littlewood (7), and the 
following comments are of some interest. 

(i) The hypothesis of (a) implies the existence of an $f \in L_2$ with the
required Fourier coefficients.

(ii) The hypothesis of (b) implies, by Theorem 2, that $d_n = o(1)$, and hence
that starring is possible.

(iii) By a conjugacy argument (a) implies (b).

(iv) In § 5 the cases $q = \nu = \infty$, $p = \nu' = 1$ are shown to hold in a modified
form.

(v) In § 6 Theorem 4 is used to prove a known result.

(vi) A similar argument to that in Zygmund (12) shows that Theorem 4
implies Theorem 2 although in a slightly less precise form.

(vii) If $d_n$ takes only the values 0, 1, -1, (a) is true. For, let $N < \infty$
be the number of non-zero terms; then by Theorem 2


JED) < (Up @)\ = N < Ky, Dn” = K, Via"). 


(viii) Similarly (b) is true if f is a function such that $d_n$ takes only the values
0, 1, -1.
(ix) Finally we have the following weaker result. 


THEOREM 5. (a) Let $d_n = o(1)$ be such that for an $\epsilon > 0$, $2 < q < \nu$,
$V_q(n^{\epsilon} d^*_n) < \infty$. Then Theorem 4 (a) holds with (13) replaced by


$J_q(f) \le A_{q,\nu,\epsilon} V_q(n^{\epsilon} d^*_n)$.

(b) With the hypothesis of Theorem 4(b) we have

$V_p(n^{-\epsilon} d^*_n) \le A_{p,\nu,\epsilon} J_p(f)$, for all $\epsilon > 0$.


As the usual conjugacy argument shows that (a) implies (b) it is sufficient
to prove (a). By Theorem 2:


$J_q(f) \le U_q(d) = U_q(d^*) \le A_{q,\nu,\epsilon} V_q(n^{\epsilon} d^*)$.













5. Some limiting cases of Paley's theorem. It is known that the Paley
results are not valid for the extreme values of p and q, that is, $p = \nu'$, $q = \nu$.
Zygmund, (11), has extended the results to these cases for uniformly bounded
$\{\phi_n\}$ by slightly modifying the hypotheses and conclusions.

Let us, for convenience, number the orthonormal sequence $\phi_2, \phi_3, \dots$, and
also let $\nu = \infty$. By $f \in L_{p,\alpha}$ we shall mean that $|f|^p (\log^+ |f|)^{\alpha} \in L$. We place
no restriction on the sequence $\{F_n\}$ and so the star theorems follow immediately
from the unstarred results. The proofs follow Zygmund's closely enough for
them to be omitted here.


THEOREM 6. Let $\{d_n\}$ be any sequence satisfying
d,’ < n—(log n)*", a <0, n = 2,3, 
where $\{d_n'\}$ is some ordering of $\{|d_n|\}$. Then $c_n = d_n F_n^{-1}$ is the coefficient with
respect to $\phi_n$ of a function f such that for $\lambda > 0$, small enough,


vl 
j exp{Alf|"} dx < A, 


C 


THEOREM 7. If $f \in L_{1,\alpha}$, $\alpha > 0$, and if $\{c_n\}$ are the Fourier coefficients of f
with respect to $\{\phi_n\}$ and if $d_n = d_n(1)$ then


ee) 1 

(a) > n ‘(log n)*"d*, < A f If|(log* |f|)* dx + B = C, 
n=2 0 

(b) > exp(—kd*, “") < @, for every k > 0, 
n=2 

(c) if in additiona <1, >> n“‘d*,'* < K.C”. 

THEOREM 8. If $\{d_n\}$ is any sequence such that
$\sum |d_n| (\log 1/|d_n|)^{\alpha} < \infty$, $\alpha > 0$,


then $c_n = d_n F_n^{-1}$ is the Fourier coefficient with respect to $\phi_n$ of a function f such
that exp (Ri f|“) € L, for every k > 0. 


THEOREM 9. If $\{d_n\}$ is such that $d_n = o(1)$ and


@ 
r—1 r 
> n”d*i < @, 


n=2 


$r > 1$, then $c_n = d_n F_n^{-1}$ is the Fourier coefficient of a function f such that exp


(RIf|") € L, for all kk > 0. 
5.2. The following theorem generalizes results due to Verblunsky, (10). 


THEOREM 10. 


(a) Ife = 





1 
: + = l,andp <r <q then 






















a (ff. f"'x ax)" < A,,,U,(c), 


l/r 
(ii) (¥ fa Ica r n~” er) /(2—’) oo ) or) < Ay.»Jp(f). 


n= 


2 = »p’ 





1 
(b) ap02= l,p<r<qthen 


aD l/r 
(ii) Jf) < Aw. (¥ lel’ ao" 2 4 r—(2urr’ ~~) : 


[f*(x) is a non-increasing rearrangement of |f(x)|, (8).] 

Extreme values of r give known theorems. For instance if, in (a), r = p then 
(i) reduces to the integral analogue of Theorem 4 (b), and (ii) becomes the 
unstarred form of Theorem 4 (b). If r = g then (i) and (ii) of (a) reduce to 
parts (a) and (b) of Theorem 2 respectively. 

The proof of (a) is by an application of Hölder's inequality using these
extreme forms.

(b) follows by a similar argument or by a conjugacy argument from (a). 

5.3. Further extensions of Paley's theorems are obtainable by integrating
with respect to q, or by multiplying through by a function K(q) and
integrating, (9). For example, integration of the unstarred form of (13) gives


((iift a) 


log |f| 








C-) (Jen| |F: */(2—9r’) n@-r"ye _ (\c,| Fz’? *")u (2—»’ )y2 1/¢ 
< AAS Pe 


a | /—*' 2/0 —~ log (Ini FY eat eats 
5.4. The Paley theorems were originally proved for the trigonometric
system by Hardy and Littlewood (4), where they arose out of the following
problem. If $f \in L_r$ and
$f \sim \sum c_n \phi_n$,


for what value of Y and X does
$\sum n^{-X} |c_n|^{Y}$
converge? Using the above results we can solve this problem in the case of an 
orthonormal sequence satisfying 
JA dn) < K n*,a > 0. 
THEOREM 11. Jf f € L,, r > 1, and (v'/p) + (2 — v'/q), then the series 


$\sum_{n=1}^{\infty} n^{-X} |c_n|^{Y}$













is convergent if 
(i) r>2,¥Y>2,X >0, 


(i) r>2,¥<2,xX>1-2, 


(ii) r= p<2,¥>4,X > alg — 2), 


(iv) r= p<%p<V<aX> (1 +a¥) —F (1 +20) 


r 


(W) r= p<2,0<V<p,X> (1 +a¥) ~~ (1 +20) 


and, in general, it is not necessarily convergent in any other case. 


The proof follows that of Hardy and Littlewood exactly. 


6. Applications. 


6.1. Let f(P) be an integrable function defined on the surface, S, of the unit
a-sphere, a > 1. Any such function can be expanded in terms of the
orthonormal sequence of ultraspherical polynomials $\{V_n^{(a)}(P)\}$ having the property


(15) $|V_n^{(a)}(P)| \le K_a n^{(a/2)-1}$.

If $f(P) \sim \sum c_n V_n^{(a)}(P)$, then we define


(16) $f(r, P) = \sum_n c_n V_n^{(a)}(P)\, r^n$, $0 < r < 1$;

f(r, P) is the function harmonic in the unit a-sphere with f(P) as boundary
function. Series (16) can be summed to the Poisson integral taken over the
surface, E, of the a-sphere of radius r. Using this representation du Plessis,
(3), has proved a radial extension of the Fejér-Riesz theorem. It is known,
(4), that when a = 2 the Fejér-Riesz theorem can be deduced from the
Paley theorems. We will show that this is so in general.
For reference we note that for orthonormal sequences satisfying (15) 


1. 1. ((a/2)—1)(1—(2/¢)) 
d,(q) = |c,|\" 


_ l/r 
T " (a/2)—1)(2— 
vie =(¥ len|’ no , ») l<r<o, 
n=1 
1—(a/2 
= max (|c,| 2”), r= © 


~ l/r 
V,(d) = (x cal” n°) l<r<o, 
1 











THEOREM 12. If f(r, P) is subharmonic in the unit a-sphere and 


(17) J ser, Pyar <G p>I1,r<1, 
E 


J (1 — r)* f(r, P) "dr < KyeC. 


It is known that it is sufficient to prove this for f harmonic and r arbitrary
but near to 1. Then it is an immediate consequence of the following lemma.


LEMMA. Let f(r, P) be given by (16), and define


7 


(18) F(r) = DO |ca| 2? 9. 


If 1 < p < 2 and (17) holds then 


f (1 — r)*°F*(r)dr < Ky,eC. 


This lemma, an extension of one in (4), is stronger than Theorem 12 when 
1 < p < 2, but is false if p > 2. 
By Theorem 2 with $\nu = q = \infty$ we have


eal w“* * <K, f ifr, P)| AP, 
and hence, from (17), 
a <ccer, 
which gives 
F(e") < KC. 


Therefore, using Lemma 36 of (4) and the unstarred form of (14), we have 


sl el 
| (1 — r)*°F*(r)dr < KypeC + | _@= r)**F*(r)dr 


0 


< Ky.eC +K | n (1 _ ei) ye 8 p99 


n=l 


es) co v 
<K,.C+K > (¥ calm's-tycnm) 
n=1 m=1 
< Ky.C + K V,(d) < K,,.C. 
It is known that Theorem 12 is false if p = 1 but the following result can
be proved.











THEOREM 13. Let f(r, P) be subharmonic in the unit a-sphere. If, p > 1, 
(19) J ve. P)|(log* |f(r, P)|)'’ dP < C,r <1, 
then 
fa — ry?" f(r, P) |’ dr < AC+B. 


This follows from the lemma, 


LEMMA. Let f(r, P) be given by (16) and F(r) by (18); then if (19) holds


1 l/p 
(f (1 — ny prG)ar) <AC+B. 
0 


The proof of this lemma is similar to the above proof using Theorem 7 (c)
in place of the unstarred form of (14). The case p = 1 of this theorem has been
proved by du Plessis, (3), who considers diametral as well as radial theorems.
6.2. If $f(P) \sim \sum c_n V_n^{(a)}(P)$

then


$f_\beta(P) \sim \sum n^{-\beta} c_n V_n^{(a)}(P)$
is called the $\beta$th integral of f. If a = 2, then Hardy and Littlewood, (5), proved
that if $f \in L_p$ then $f_\beta \in L_q$ where $\beta = 1/p - 1/q$. This result has been
extended by du Plessis, (2), to general a. Zygmund (12) has shown that,
in the case a = 2, the result follows from the Paley star theorems provided
p < 2 < q. We will show that this is the case in general, assuming the truth
of Theorem 4.


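THEOREM 14. If $f \in L_p$, $p > 1$, $0 < \beta < (a - 1)/p$, then $f_\beta \in L_q$ where q
is given by $\beta = (a - 1)(1/p - 1/q)$. Further


$$\Bigl(\int_S |f_\beta|^{q}\,dP\Bigr)^{1/q} \le K_{p,q,a} \Bigl(\int_S |f|^{p}\,dP\Bigr)^{1/p}.$$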
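The relation between p, q, $\beta$ and a in Theorem 14 can be made concrete with a small computation. This is a minimal sketch, assuming the relation reads $\beta = (a-1)(1/p - 1/q)$ with the strict range $0 < \beta < (a-1)/p$ as printed; the function name `target_exponent` is hypothetical and not part of the paper.

```python
from fractions import Fraction


def target_exponent(p, beta, a):
    """Solve beta = (a - 1) * (1/p - 1/q) for q, using exact arithmetic."""
    p, beta, a = Fraction(p), Fraction(beta), Fraction(a)
    # Assumed hypotheses of Theorem 14: p > 1 and 0 < beta < (a - 1)/p.
    if not (p > 1 and 0 < beta < (a - 1) / p):
        raise ValueError("need p > 1 and 0 < beta < (a - 1)/p")
    inv_q = 1 / p - beta / (a - 1)  # positive because beta < (a - 1)/p
    return 1 / inv_q


# a = 2 gives the Hardy-Littlewood relation beta = 1/p - 1/q.
print(target_exponent(Fraction(3, 2), Fraction(1, 6), 2))  # 2
print(target_exponent(2, Fraction(1, 2), 3))               # 4
```

The upper bound on $\beta$ is exactly what keeps $1/q$ positive, so q is finite throughout the admissible range.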


We may assume that the right-hand integral has value 1. From Theorem 2, 


l/¢ 
(f vat ap) < Ky,aU,(d) < Ky,2U,(a*) 


r * 1/p’) 1—(p/a’) t7(v/a’) 
< K,,2max(d,n” )~""’V,"""’(d*), 
° , , . , 
since p’ > q’, and provided q’ > p. 
Since d*,?n?—? decreases monotonically we have 
. 1/p’ 
d,n’” = 0(1) 


with bound not exceeding V,?(d*). 
Hence if g’ > p, 






















i/¢ 
(fal aP) < Kpgn VEE) < Kye 


by Theorem 4. The completion to all p, q, p < 2 < q, follows as in Zygmund,
(12).

It is known that Theorem 14 is false if p = 1 but a modified theorem can
be proved, again subject to q > 2, although the result is probably true without
this restriction.


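THEOREM 15. (i) If $f \in L_{1,1/q'}$, $q > 1$, then $f_\beta \in L_q$ where $\beta$ is given by
$\beta = (a - 1)/q'$. Moreover


$$\Bigl(\int_S |f_\beta|^{q}\,dP\Bigr)^{1/q} \le A \int_S |f| (\log^+ |f|)^{1/q'}\,dP + B.$$


(ii) If $f \in L_{1,1}$, then $f_\beta \in L_q$ where q is given by $\beta = (a - 1)/q'$, and moreover


$$\Bigl(\int_S |f_\beta|^{q}\,dP\Bigr)^{1/q} \le A \int_S |f| \log^+ |f|\,dP + B.$$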
8s s 


This is a generalization of a result due to Zygmund (11) although his proof 
is different. We deduce it from Theorem 7 (c) and the unstarred Theorem 4. 
Let $d_n = d_n(1)$; then these two results imply that


l/¢ on l/¢ 
(f elt dP) < Kil 2,|* n~) <A f\(log* |f|)"" dP + B. 
s n=1 Js 


which is (i). In a similar manner (ii) follows, but is also a consequence of (i)
since $f \in L_{1,1}$ implies $f \in L_{1,1/q'}$ for all $q > 1$.


REFERENCES 

1. A. P. Calderón and A. Zygmund, On the theorem of Hausdorff-Young and its extensions,
Ann. Math. Studies No. 25.

2. N. du Plessis, Some theorems about the Riesz fractional integral, Trans. Amer. Math. Soc., 
80 (1955), 124-134. 

3. ——— Spherical Fejér—Riesz theorems, J. London Math. Soc., 31 (1956), 386-91. 

4. G. H. Hardy and J. E. Littlewood, Some new properties of Fourier constants, Math. Annalen, 
97 (1926), 159-209. 

5. ———— Some properties of fractional integrals II, Math. Zeitschrift, 34 (1931), 403-439. 

6. S. Kaczmarz and H. Steinhaus, Theorie der Orthogonalreihen (Chelsea, 1951).

7. J. E. Littlewood, On a theorem of Paley, J. London Math. Soc., 29 (1954), 387-395.

8. J. Marcinkiewicz and A. Zygmund, Some theorems on orthogonal systems, Fund. Math.,
28 (1937), 309-335.

9. H. P. Mulholland, Concerning the generalization of the Young-Hausdorff theorem, Proc.
London Math. Soc., 35 (1933), 257-293.

10. S. Verblunsky, Fourier constants and Lebesgue classes, Proc. London Math. Soc., 24 (1935), 
1-31. 

11. A. Zygmund, Some points in the theory of trigonometric series and power series, Trans.
Amer. Math. Soc., 36 (1934), 586-617.
12. —— Trigonometric series (2nd ed.; Cambridge, 1959), I, II.


University of British Columbia 











ON A CLASS OF SINGULAR DIFFERENTIAL 
OPERATORS 


R. R. D. KEMP 


In the considerable literature on linear operators in $L_2$ or $L_p$ arising from
ordinary differential operators it has always been assumed that the coefficient 
of the highest order derivative appearing does not vanish in the interior of 
the interval under consideration. If this coefficient vanishes at one or both 
endpoints of the interval, or if one or both of the endpoints is infinite the 
differential operator is said to be singular. In this paper we shall allow this 
leading coefficient to vanish in the interior of the interval, and show that 
the theory of such operators can sometimes be reduced to a consideration 
of several operators of the well-known type. We shall also indicate how those 
which cannot be so reduced should be dealt with. 

A cursory examination of the problem leads one to the conclusion that the 
major change from the known situation will be in the definition of appro- 
priate domains for such operators, and thus in the construction of appropriate 
boundary conditions. 

Thus in § 1 we shall define the domains of basic minimal and maximal 
operators associated with a given differential expression, and thus define a 
class of operators arising from a differential expression. This class of operators 
will be the subject of the rest of the paper, and in § 2 we show how these 
operators’ domains are determined by suitable boundary conditions. In § 3 
we restrict to a narrower class of differential expressions in order to obtain 
more detailed information. For this restricted class we have a problem, which 
differs from the known case only in the nature of the boundary conditions. 
In § 4 we show that operators of this restricted class, which are formally
self-adjoint, give rise to $L_2$ expansion theorems, and in § 5 we consider a
few examples.


1. Differential operators and adjoints. We shall consider operators
on $L_p(I)$ ($1 \le p \le \infty$) for any interval $I = [a, b]$ (where a or b or both may
be infinite), which are generated by expressions


(1.1) $\tau = \sum_{j=0}^{n} p_j(x) D^{n-j}$,


where $D = d/dx$ and $p_j$ is a complex-valued function belonging to $C^{n-j}(I)$.
We define the adjoint differential expression by


Received January 26, 1960. This research was carried out while the author held a Fellowship 
at the Summer Research Institute of the Canadian Mathematical Congress. 




(1.2) $\tau^* = \sum_{j=0}^{n} q_j(x) D^{n-j}$,

$q_j = \sum_{k=0}^{j} (-1)^{n-k} \binom{n-k}{j-k} D^{j-k} p_k$.


Note that $\tau^*$ is an operator of the same type as $\tau$, and that it is the conjugate
of the usual Lagrange adjoint. This modification is made for convenience in
dealing with Banach space adjoints, and when we consider formally
self-adjoint operators in § 4 we shall return to the more usual notation.
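The coefficients of the adjoint expression can be checked on a concrete example with polynomial coefficients. The sketch below assumes that (1.2) is the expansion of $\tau^* g = \sum_k (-1)^{n-k} D^{n-k}(p_k g)$ by Leibniz's rule, so that $q_j = \sum_{k \le j} (-1)^{n-k} \binom{n-k}{j-k} D^{j-k} p_k$; this reading of the formula, the list-of-coefficients representation of polynomials, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

# A polynomial is a list of coefficients in ascending powers of x.

def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def scale(s, a):
    return trim([s * x for x in a])

def mul(a, b):
    out = [Fraction(0)] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def diff(a, times=1):
    a = list(a)
    for _ in range(times):
        a = [i * a[i] for i in range(1, len(a))]
    return trim(a)

def adjoint_coefficients(p):
    """q_j = sum_{k<=j} (-1)^(n-k) C(n-k, j-k) D^(j-k) p_k (assumed form of (1.2))."""
    n = len(p) - 1
    q = []
    for j in range(n + 1):
        qj = []
        for k in range(j + 1):
            sign_binom = (-1) ** (n - k) * comb(n - k, j - k)
            qj = add(qj, scale(sign_binom, diff(p[k], j - k)))
        q.append(qj)
    return q

def apply_expression(coeffs, g):
    """Apply sum_j coeffs[j] D^(n-j) to the polynomial g."""
    n = len(coeffs) - 1
    out = []
    for j, c in enumerate(coeffs):
        out = add(out, mul(c, diff(g, n - j)))
    return out

def lagrange_form(p, g):
    """Compute sum_k (-1)^(n-k) D^(n-k)(p_k g) directly."""
    n = len(p) - 1
    out = []
    for k, pk in enumerate(p):
        out = add(out, scale((-1) ** (n - k), diff(mul(pk, g), n - k)))
    return out

# tau = x D^2 + (1 + x^2) D + 3, so p_0 = x vanishes at x = 0.
p = [[0, 1], [1, 0, 1], [3]]
q = adjoint_coefficients(p)
print(q)  # [[0, 1], [1, 0, -1], [3, -2]]

g = [1, 2, 0, 1]
print(apply_expression(q, g) == lagrange_form(p, g))  # True
```

For this example the adjoint is $x D^2 + (1 - x^2) D + (3 - 2x)$, which agrees with expanding $(xg)'' - ((1+x^2)g)' + 3g$ by hand.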

When one assumes that $p_0(x) \ne 0$ on the interior of I, the definition of
minimal and maximal operators on $L_p(I)$, which are associated with $\tau$, is
relatively direct (see, for example, Rota (8)). However, as we do not wish
to make this assumption some modifications must be made. Denoting by
$K_n(I)$ the space of all n times continuously differentiable functions which
vanish outside a compact subset of I we define the operators $T^R(\tau, p, I)$ on
$L_p(I)$ for $1 \le p \le \infty$ by $T^R(\tau, p, I)f = \tau f$ for $f \in K_n(I)$. Note that $\tau f$ is
continuous on I, and vanishes outside a compact subset if I is unbounded,
so $\tau f \in L_p(I)$.

Now for any $p \ge 1$ $T^R(\tau, p, I)$ has an adjoint $T_1(\tau^*, q, I)$ on $L_q(I)$ with
domain $D_1(\tau^*, q, I)$, and $T_1(\tau^*, q, I)$ has an adjoint $T_0(\tau, p, I)$ on $L_p(I)$ with
domain $D_0(\tau, p, I)$. Clearly, $T_0(\tau, p, I)$ is an extension of the closure of
$T^R(\tau, p, I)$. It will, in fact, be the closure unless $p = \infty$.

We shall now show how $T_1(\tau^*, q, I)$ is related to $\tau^*$, and give a
characterization of $D_1(\tau^*, q, I)$. First we define


(1.3) $\tau_0^* f = p_0(x) f(x)$,
$\tau_{k+1}^* f = D(\tau_k^* f) + (-1)^{k+1} p_{k+1}(x) f(x)$, $k = 0, 1, \dots, n - 1$.


THEOREM 1.1. $D_1(\tau^*, q, I) = \{f \in L_q(I) \mid \tau_k^* f$ is absolutely continuous on any
compact subset of I, $k = 0, 1, \dots, n - 1$; $\tau_n^* f \in L_q(I)\}$. Also, for
$f \in D_1(\tau^*, q, I)$, $T_1(\tau^*, q, I)f = (-1)^n \tau_n^* f$, and if $f \in C^n$ on a neighbourhood
of $x_0$ then $T_1(\tau^*, q, I)f = \tau^* f$ on that neighbourhood.


Proof. By definition $f \in D_1(\tau^*, q, I)$ if and only if $f \in L_q(I)$, and there is
$f_1 \in L_q(I)$ such that for any $g \in K_n(I)$ we have

(1.4) $\int_I (f\,\tau g - f_1 g)\,dx = 0$.


Given such a g there is a compact interval $[c, d] \subset I$ outside of which g is
zero. Thus if $\phi = D^n g$ we have

(1.5a) $D^k g(x) = \int_c^x \frac{(x - t)^{n-k-1}}{(n-k-1)!}\,\phi(t)\,dt$, $k = 0, 1, \dots, n - 1$;

(1.5b) $\int_c^d t^k \phi(t)\,dt = 0$, $k = 0, 1, \dots, n - 1$;


and may rewrite (1.4) in the form



















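$$0 = \int_c^d \Bigl[ p_0(x) f(x) \phi(x) + \sum_{j=1}^{n} p_j(x) f(x) \int_c^x \frac{(x-t)^{j-1}}{(j-1)!}\,\phi(t)\,dt - f_1(x) \int_c^x \frac{(x-t)^{n-1}}{(n-1)!}\,\phi(t)\,dt \Bigr] dx.$$


When we interchange the order of integration in each term which involves
two integrals this becomes


$$0 = \int_c^d \phi(x) \Bigl[ p_0(x) f(x) + \sum_{j=1}^{n} \int_x^d \frac{(\xi-x)^{j-1}}{(j-1)!}\,p_j(\xi) f(\xi)\,d\xi - \int_x^d \frac{(\xi-x)^{n-1}}{(n-1)!}\,f_1(\xi)\,d\xi \Bigr] dx = \int_c^d \phi(x) \psi(x)\,dx.$$


Now we may choose any $\phi(x)$ fulfilling (1.5b) and define $g \in K_n(I)$ by (1.5a)
(for k = 0) inside [c, d], and by 0 outside [c, d]. Thus $\psi(x)$ is a function in
$L_q(c, d)$ which is annihilated by all functions $\phi$ fulfilling (1.5b), and it must
be equal almost everywhere to a polynomial of degree n - 1.

Thus we have


(1.6) $$p_0(x) f(x) + \sum_{j=1}^{n} \int_x^d \frac{(\xi-x)^{j-1}}{(j-1)!}\,p_j(\xi) f(\xi)\,d\xi - \int_x^d \frac{(\xi-x)^{n-1}}{(n-1)!}\,f_1(\xi)\,d\xi = P_{n-1}(x)$$ a.e. on [c, d].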


At this point in the known case, one alters f on a set of measure 0 so that
(1.6) holds everywhere on [c, d]. We may do this except at points where
$p_0(x)$ is zero, and shall assume that this has been done. Thus (1.6) holds
except at a subset A of measure 0 of $\mathfrak{N}_0 = \{x \mid p_0(x) = 0\}$. Thus for any
$x_0 \in A$ we have


$\lim_{x \to x_0} p_0(x) f(x)$

existing, where the limit is taken along a sequence of points not in A. As
$f \in L_q(I)$ it cannot be infinite on an open set, so if $x_0$ is an interior point
of $\mathfrak{N}_0$, or if it is the limit of interior points of $\mathfrak{N}_0$, this limit is 0. In any case
we see that (1.6) can fail to hold only at points where $p_0(x) = 0$ and f(x) is
infinite, so the definition of the product was in doubt in any case. We define
its value to be this limit, so that (1.6) will hold everywhere in [c, d].

Thus $\tau_0^* f$ is absolutely continuous on [c, d]. If $\tau_k^* f$ exists and is absolutely
continuous on [c, d] then (1.6) yields


$$\tau_k^* f + (-1)^k \sum_{j=k+1}^{n} \int_x^d \frac{(\xi-x)^{j-k-1}}{(j-k-1)!}\,p_j(\xi) f(\xi)\,d\xi - (-1)^k \int_x^d \frac{(\xi-x)^{n-k-1}}{(n-k-1)!}\,f_1(\xi)\,d\xi = D^k P_{n-1}(x)$$ on [c, d].








Thus


$$D(\tau_k^* f) + (-1)^{k+1} p_{k+1}(x) f(x) + (-1)^{k+1} \sum_{j=k+2}^{n} \int_x^d \frac{(\xi-x)^{j-k-2}}{(j-k-2)!}\,p_j(\xi) f(\xi)\,d\xi - (-1)^{k+1} \int_x^d \frac{(\xi-x)^{n-k-2}}{(n-k-2)!}\,f_1(\xi)\,d\xi = D^{k+1} P_{n-1}(x)$$ a.e. on [c, d].


However, this means that the derivative $\tau_{k+1}^* f$ of an absolutely continuous
function


$$\tau_k^* f + (-1)^k \int_x^d p_{k+1}(\xi) f(\xi)\,d\xi$$


is equal a.e. to an absolutely continuous function. It must thus be equal
everywhere, and $\tau_{k+1}^* f$ exists and is absolutely continuous on [c, d].

Therefore $\tau_k^* f$, for $k = 0, 1, \dots, n - 1$ are absolutely continuous on [c, d]
and $\tau_n^* f = (-1)^n f_1$ a.e. on [c, d]. As we could begin with any compact $[c, d] \subset I$
this completes the proof of the description of $D_1(\tau^*, q, I)$, and of the action
of $T_1(\tau^*, q, I)$ on this domain. The fact that $T_1(\tau^*, q, I)f = \tau^* f$ if $f \in C^n$
follows from an easy computation.

For $1 \le p < \infty$ it follows (see, for example, Rota (8)) that $T_0(\tau, p, I)$ is
the closure of $T^R(\tau, p, I)$, and is thus a restriction of $T_1(\tau, p, I)$. For $p = \infty$
the graph of $T_0(\tau, \infty, I)$ is the closure of the graph of $T^R(\tau, \infty, I)$ in the
$L_1(I) \oplus L_1(I)$ topology of $L_\infty(I) \oplus L_\infty(I)$. Thus $T_0(\tau, \infty, I)$ will, in general,
be a proper extension of the closure of $T^R(\tau, \infty, I)$. However, the graph of
$T_1(\tau, \infty, I)$ is also closed in the $L_1(I) \oplus L_1(I)$ topology of $L_\infty(I) \oplus L_\infty(I)$,
and contains the graph of $T^R(\tau, \infty, I)$. Thus again $T_0(\tau, \infty, I) \subset T_1(\tau, \infty, I)$
and we have


Tk. D-Tie’.eD 1¢<9<0,¢e—-- 


T%+.9.D=Téh*.41D 1¢9<0,9¢e— 


$T_0(\tau, p, I) \subset T_1(\tau, p, I)$, $1 \le p \le \infty$.


We shall consider closed operators T on $L_p(I)$ such that
$T_0(\tau, p, I) \subset T \subset T_1(\tau, p, I)$. These will be called differential operators
associated with $\tau$, or $\tau$-operators. Clearly the adjoint of a $\tau$-operator on
$L_p(I)$ is a $\tau^*$-operator on $L_q(I)$.


2. $\tau$-Operators and boundary conditions. The direct way of specifying
a $\tau$-operator T on $L_p(I)$ is to give its domain D(T) as a subspace of
$D_1(\tau, p, I)$, which contains $D_0(\tau, p, I)$. We note that under the norm
$\|f\|_{\tau,p} = \|T_1(\tau, p, I)f\|_p + \|f\|_p$ (called the $\tau$-norm on $L_p(I)$), $D_1(\tau, p, I)$ is a
Banach space, and $D_0(\tau, p, I)$ is a closed subspace. In order that T be closed
D(T) must also be a closed subspace of $D_1(\tau, p, I)$ under this norm. Clearly
we could also specify D(T) by giving the subspace of $D_1(\tau, p, I)/D_0(\tau, p, I)$
onto which it projects under the natural projection of $D_1(\tau, p, I)$ onto
$D_1(\tau, p, I)/D_0(\tau, p, I)$.

As the image of an $f \in D_1(\tau, p, I)$ in the space $D_1(\tau, p, I)/D_0(\tau, p, I)$
represents the portion of f not in $D_0(\tau, p, I)$ it also represents the way in
which f fails to be appropriately zero at the "boundary" of I. Thus we shall
call this projection of f the boundary value (or boundary values) of f, and call
the space $D_1(\tau, p, I)/D_0(\tau, p, I)$ the space of boundary values for $\tau$ on $L_p(I)$.
This makes it natural to define a boundary condition for $\tau$ on $L_p(I)$ as an element
of $[D_1(\tau, p, I)/D_0(\tau, p, I)]^*$, which is thus the space of boundary conditions
for $\tau$ on $L_p(I)$.

Clearly, a boundary condition F for $\tau$ on $L_p(I)$ is completely specified by
a linear functional on $D_1(\tau, p, I)$ which vanishes on $D_0(\tau, p, I)$ and is
continuous in the $\tau$-norm on $L_p(I)$. We shall also denote this functional by F.


THEOREM 2.1. If F is a boundary condition for $\tau$ on $L_p(I)$ there is
$g \in D_1(\tau^*, q, I)$ such that for any $f \in D_1(\tau, p, I)$


$$F(f) = \int_I [T_1(\tau, p, I)f(x)\,g(x) - f(x)\,T_1(\tau^*, q, I)g(x)]\,dx.$$


Thus the space of boundary conditions for $\tau$ on $L_p(I)$ is isomorphic to the space
of boundary values for $\tau^*$ on $L_q(I)$.


Proof. We see immediately that


$$\langle f, g \rangle = \int_I [T_1(\tau, p, I)f\,g - f\,T_1(\tau^*, q, I)g]\,dx$$


is a bilinear form on $D_1(\tau, p, I) \times D_1(\tau^*, q, I)$, which satisfies the following
conditions:


$\langle f, g \rangle = 0$ for all $f \in D_1(\tau, p, I)$ if and only if $g \in D_0(\tau^*, q, I)$,
$\langle f, g \rangle = 0$ for all $g \in D_1(\tau^*, q, I)$ if and only if $f \in D_0(\tau, p, I)$,
$|\langle f, g \rangle| \le \|f\|_{\tau,p} \|g\|_{\tau^*,q}$.


Thus $\langle f, g \rangle$ induces a continuous, non-singular bilinear form on
$[D_1(\tau, p, I)/D_0(\tau, p, I)] \times [D_1(\tau^*, q, I)/D_0(\tau^*, q, I)]$, and if F is any non-zero boundary
condition for $\tau$ on $L_p(I)$ there is $f_1 \in D_1(\tau, p, I)/D_0(\tau, p, I)$ for which $F(f_1) \ne 0$.
Now for any $f \in D_1(\tau, p, I)/D_0(\tau, p, I)$ we have $f = F(f) f_1/F(f_1) + f_0$ where
$F(f_0) = 0$. If $\mathfrak{N}$ consists of all $g \in D_1(\tau^*, q, I)/D_0(\tau^*, q, I)$, which annihilate
the null-space of F, there must exist $g_1 \in \mathfrak{N}$ such that $\langle f_1, g_1 \rangle \ne 0$ or $\mathfrak{N}$ would
annihilate all of $D_1(\tau, p, I)/D_0(\tau, p, I)$ and $\langle f, g \rangle$ would be singular. Then
$F(f) = \langle f, g_2 \rangle$ where $g_2 = F(f_1) g_1/\langle f_1, g_1 \rangle$.

The mapping $F \to g_2$ is clearly an isomorphism so the proof is complete.

This representation of boundary conditions allows us to show in what sense
boundary values and boundary conditions are related to the boundary of
the interval I.










THEOREM 2.2. If $p_0(x) \ne 0$ on $[x_1, x_2] \subset I$, where $a < x_1 < x_2 < b$, the value
of $\langle f, g \rangle$ depends only on the values of f and g on the set $I - [x_1, x_2]$.


Proof. Since the continuous function $p_0(x)$ is non-zero on the closed interval
$[x_1, x_2]$ there is a larger closed interval $[x_1', x_2']$ such that $a < x_1' < x_1 < x_2 <
x_2' < b$, on which $p_0(x) \ne 0$. Thus for $f \in D_1(\tau, p, I)$, $g \in D_1(\tau^*, q, I)$ we have


$$\langle f, g \rangle = \int_{I - (x_1', x_2')} [T_1(\tau, p, I)f\,g - f\,T_1(\tau^*, q, I)g]\,dx + \int_{(x_1', x_2')} [T_1(\tau, p, I)f\,g - f\,T_1(\tau^*, q, I)g]\,dx.$$


However on $[x_1', x_2']$ it is clear that f and g have absolutely continuous
derivatives up to order n - 1 and $T_1(\tau, p, I)f = \tau f$, $T_1(\tau^*, q, I)g = \tau^* g$ on this
interval. Thus the second term above is given by


$$\int_{x_1'}^{x_2'} [\tau f\,g - f\,\tau^* g]\,dx = [fg](x_2') - [fg](x_1'),$$


where

$$[fg](x) = \sum_{j=0}^{n-1} \sum_{k=1}^{n-j} (-1)^{k+1} D^{n-j-k} f(x)\, D^{k-1}(p_j(x) g(x)).$$


Thus $\langle f, g \rangle$ is given by an expression fulfilling the requirements of the theorem.
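The integration-by-parts identity used in the proof can be verified exactly for polynomial data. The sketch below assumes the boundary form reads $[fg](x) = \sum_{j=0}^{n-1} \sum_{k=1}^{n-j} (-1)^{k+1} D^{n-j-k} f(x)\, D^{k-1}(p_j g)(x)$ and that $\tau^* g = \sum_j (-1)^{n-j} D^{n-j}(p_j g)$; the helper names and the list-of-coefficients representation of polynomials are hypothetical choices for illustration.

```python
from fractions import Fraction

# A polynomial is a list of coefficients in ascending powers of x.

def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def scale(s, a):
    return trim([s * x for x in a])

def mul(a, b):
    out = [Fraction(0)] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def diff(a, times=1):
    a = list(a)
    for _ in range(times):
        a = [i * a[i] for i in range(1, len(a))]
    return trim(a)

def evaluate(a, x):
    return sum((c * x ** i for i, c in enumerate(a)), Fraction(0))

def integrate(a, lo, hi):
    prim = [Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(a)]
    return evaluate(prim, hi) - evaluate(prim, lo)

def tau(p, f):
    """tau f = sum_j p_j D^(n-j) f."""
    n = len(p) - 1
    out = []
    for j, pj in enumerate(p):
        out = add(out, mul(pj, diff(f, n - j)))
    return out

def tau_star(p, g):
    """tau* g = sum_j (-1)^(n-j) D^(n-j)(p_j g) (assumed form of the adjoint)."""
    n = len(p) - 1
    out = []
    for j, pj in enumerate(p):
        out = add(out, scale((-1) ** (n - j), diff(mul(pj, g), n - j)))
    return out

def boundary_form(p, f, g, x):
    """[fg](x) as a double sum over j and k (assumed form)."""
    n = len(p) - 1
    total = Fraction(0)
    for j in range(n):
        h = mul(p[j], g)
        for k in range(1, n - j + 1):
            total += ((-1) ** (k + 1)
                      * evaluate(diff(f, n - j - k), x)
                      * evaluate(diff(h, k - 1), x))
    return total

# tau = x D^2 + (1 + x^2) D + 3 on [1, 2], where p_0(x) = x is non-zero.
p = [[0, 1], [1, 0, 1], [3]]
f = [1, 0, 2]
g = [0, 1, 1]
x1, x2 = Fraction(1), Fraction(2)

integrand = add(mul(tau(p, f), g), scale(-1, mul(f, tau_star(p, g))))
lhs = integrate(integrand, x1, x2)
rhs = boundary_form(p, f, g, x2) - boundary_form(p, f, g, x1)
print(lhs == rhs)  # True
```

For smooth f and g the identity is pure integration by parts and does not itself need $p_0 \ne 0$; the hypothesis matters in the theorem because it guarantees that elements of the operator domains are smooth enough on $[x_1', x_2']$ for the identity to apply.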


COROLLARY 2.1. $\langle f, g \rangle$ depends only on the values of f and g in the
neighbourhood of $\mathfrak{N}_0 = \{x \mid p_0(x) = 0\}$ and in the neighbourhood of the endpoints of I.


THEOREM 2.3. If $p_0(x) = 0$ on $[x_1, x_2] \subset I$ then the value of $\langle f, g \rangle$ depends
only on the values of f and g at points outside $[x_1, x_2]$, at its endpoints, and in
the neighbourhood of points inside $(x_1, x_2)$ where $p_1(x)$ is zero.


Proof. We may assume that $p_0(x)$ is not identically zero on any interval
containing $[x_1, x_2]$, although this does not alter the proof. The conditions
which the restriction of $f \in D_1(\tau, p, I)$ to $[x_1, x_2]$ must satisfy are: $\tau_0 f = q_0 f = 0$,
$\tau_1 f = -q_1 f$ absolutely continuous, etc. Thus it is clear that on $[x_1, x_2]$ we
are really dealing with the operator


$$\tau_{[x_1, x_2]} = \sum_{j=1}^{n} p_j(x) D^{n-j},$$


and we may apply Theorem 2.2 to the boundary form $\langle f, g \rangle$ for this operator
to obtain the desired result.

This reduction process can clearly be repeated to arrive at a complete
characterization of the points in I at which the values of f and g are relevant
to the values of $\langle f, g \rangle$. We use the notations $\mathfrak{N}_k = \{x \mid p_j(x) = 0, j = 0, 1, \dots, k\}$
and $\mathfrak{N}_k^0$ for the interior of $\mathfrak{N}_k$, and combine our results in the following theorem.


THEOREM 2.4. The value of $\langle f, g \rangle$ depends only on the values of f and g in the
neighbourhood of the set $\mathfrak{B} = \{x \in I \mid x$ is an endpoint of I, or there exists an
integer k between 0 and n - 1 such that $x \in \mathfrak{N}_k$, $x \notin \mathfrak{N}_k^0\}$.
















Thus for the differential operators $\tau$ and $\tau^*$ the set $\mathfrak{B}$ plays the role of
the boundary of the interval I. The boundary values of a function in $D_1(\tau, p, I)$
or $D_1(\tau^*, q, I)$ depend only on the values of the function in a neighbourhood
of $\mathfrak{B}$. Thus there are boundary conditions for $\tau$ on $L_p(I)$, which depend on
values of the function away from the endpoints of I. This is the main
significant difference which arises in our general class of operators.

Another remark might be made at this stage. If $f \in C^n(I) \cap L_p(I)$ and
$g \in C^n(I) \cap L_q(I)$ it is easily seen that $\langle f, g \rangle$ depends only on the values of
f and g near the endpoints of I. If $f_n$ and $g_m$ are sequences of such functions
which converge in $L_p(I)$ and $L_q(I)$ to $f_0$ and $g_0$ respectively, and if $\tau f_n$ and
$\tau^* g_m$ converge in $L_p(I)$ and $L_q(I)$ to $f_0^*$ and $g_0^*$ respectively, it is clear that
$f_0 \in D_1(\tau, p, I)$, $g_0 \in D_1(\tau^*, q, I)$, $T_1(\tau, p, I)f_0 = f_0^*$, $T_1(\tau^*, q, I)g_0 = g_0^*$, and
that as n and m approach $\infty$ $\langle f_n, g_m \rangle$ converges to $\langle f_0, g_0 \rangle$. Thus it is clear
that $\langle f_0, g_0 \rangle$ depends only on the values of $f_0$ and $g_0$ in the neighbourhood of
the endpoints of I. Thus $T_1(\tau, p, I)$ and $T_1(\tau^*, q, I)$ are not given, in general,
by the closures of $\tau$ and $\tau^*$ on $C^n(I) \cap L_p(I)$ and $C^n(I) \cap L_q(I)$ respectively.


3. Regular operators. In order to obtain more specific results it seems
to be necessary to restrict the class of operators somewhat. The natural
restriction to make, and one which we shall make throughout the remainder
of this paper, is that $\mathfrak{B}$ should be finite, consisting of $\{x_0, x_1, \dots, x_m\}$, where
$x_0$ and $x_m$ are the endpoints of I, and either or both may be infinite. We shall
denote by $I_j$ the interval $[x_{j-1}, x_j]$, $j = 1, \dots, m$.


It is necessary to restrict somewhat further than this however. The essential
spectrum of an operator T is the set $\{\lambda \mid T - \lambda$ does not have closed range$\}$, and
we shall define the essential spectrum of $\tau$ to be the essential spectrum of
$T_0(\tau, p, I)$ and denote it by $\sigma_e(\tau, p, I)$. The essential point spectrum of $\tau$ is the
point spectrum of $T_0(\tau, p, I)$ and will be denoted by $P\sigma_e(\tau, p, I)$. The essential
resolvent set $\rho_e(\tau, p, I)$ is the complement of


$\sigma_e(\tau, p, I) \cup P\sigma_e(\tau, p, I)$,


and we shall say that $\tau$ is a regular operator if $\mathfrak{B}$ is finite and if $\rho_e(\tau, p, I)$ is
non-empty.


THEOREM 3.1. If $I^1$ and $I^2$ are two subintervals of I such that $I^1 \cap I^2$ is a
single point and $I^1 \cup I^2 = I$, then


$\sigma_e(\tau, p, I) = \sigma_e(\tau, p, I^1) \cup \sigma_e(\tau, p, I^2)$
and


$P\sigma_e(\tau, p, I) \supset P\sigma_e(\tau, p, I^1) \cup P\sigma_e(\tau, p, I^2)$.


Proof. If $x_0$ is the point common to $I^1$ and $I^2$ it is finite, so for $f \in D_1(\tau, p, I)$,
$D_1(\tau, p, I^1)$, or $D_1(\tau, p, I^2)$, it follows that $\tau_k f$ for $k = 0, 1, \dots, n - 1$ has a
finite limit at $x_0$. Thus if $f \in D_0(\tau, p, I^1)$ and $T_0(\tau, p, I^1)f = \lambda f$ it follows that
for any g € C"(J'), which is zero on a neighbourhood of x», we must have 




























o 
Il 


J tr. p, I’) fg — fr* g) dx 


DY (=1)*** r9-af (& — 6) g°-'(% — ©). 


«0+ k=l 


ll 
5 


Thus for $k = 0, 1, \dots, n - 1$ the limit of $\tau_k f(x)$ as $x \to x_0$ is zero, and the
function $f_1$ which is equal to f on $I^1$ and is zero on $I^2$ belongs to $D_0(\tau, p, I)$
and has the property that $T_0(\tau, p, I)f_1 = \lambda f_1$. Thus


$P\sigma_e(\tau, p, I^1) \subset P\sigma_e(\tau, p, I)$,
and the proof for $I^2$ is precisely the same.


For the other part of the theorem we note that as the null-space of
$T_0(\tau, p, I) - \lambda$ is contained in the direct sum of the null-spaces of
$T_1(\tau, p, I_j) - \lambda$, which are finite dimensional, the former must be finite dimensional
also. Similarly the null-spaces of $T_0(\tau, p, I^1) - \lambda$ and $T_0(\tau, p, I^2) - \lambda$ must
be finite dimensional. Thus results of Rota (8) yield our conclusion.

We might note that these results depend only on $\mathfrak{B}$ being finite, and not
on $\rho_e(\tau, p, I)$ being non-empty. There are several immediate consequences
of Theorem 3.1 which we list as corollaries.


COROLLARY 3.1. If $\mathfrak{B}$ is finite


$$\sigma_e(\tau, p, I) = \bigcup_{j=1}^{m} \sigma_e(\tau, p, I_j)$$


and


$$P\sigma_e(\tau, p, I) \supset \bigcup_{j=1}^{m} P\sigma_e(\tau, p, I_j),$$


and if $\rho_e(\tau, p, I)$ is non-empty, we have $\rho_e(\tau, p, I_j)$ non-empty for $j = 1, \dots, m$.


COROLLARY 3.2. $\sigma_e(\tau, p, I)$ is closed and equal to the essential spectrum of any
$\tau$-operator if $\rho_e(\tau, p, I)$ is non-empty. Also $\sigma_e(\tau, p, I)$ coincides with $\sigma_e(\tau^*, q, I)$
if $1 < p < \infty$.


Corollary 3.1 is obvious and Corollary 3.2 follows from corresponding
results of Rota (8) for the usual case. One cannot state that $P\sigma_e(\tau, p, I)$ is
closed, and it is quite possible for $P\sigma_e(\tau, p, I_j)$ to be empty for $j = 1, \dots, m$
and yet have $P\sigma_e(\tau, p, I)$ non-empty.


THEOREM 3.2. If $\tau$ is regular on $L_p(I)$ or if $\mathfrak{B}$ is finite and $\rho_e(\tau, p, I_j)$ is
non-empty for $j = 1, 2, \dots, m$ then $D_1(\tau, p, I)/D_0(\tau, p, I)$ is finite dimensional.


Proof. We note that by Corollary 3.1 $\tau$ regular on $L_p(I)$ implies that
$\rho_e(\tau, p, I_j)$ is non-empty for $j = 1, 2, \dots, m$. It is the latter which is the
necessary hypothesis here, and it does not imply that $\rho_e(\tau, p, I)$ is non-empty.

Now for each j there is an integer $k_j$ such that $p_i(x) \equiv 0$ on $I_j$ for $i < k_j$
and $p_{k_j}(x) \ne 0$ for x in the interior of $I_j$. Thus every $f \in D_1(\tau, p, I)$ belongs,












when restricted to $I_j$, to the domain of an operator of order $n - k_j$, which
we shall still denote by $\tau$. On $I_j$ $\tau$ satisfies all the hypotheses required by
Rota (8), so $D_1(\tau, p, I_j)/D_0(\tau, p, I_j)$ is finite dimensional. Let $\pi_j$ be the
projection of $D_1(\tau, p, I_j)$ onto $D_1(\tau, p, I_j)/D_0(\tau, p, I_j)$. If we define the
transformation $\pi$ from $D_1(\tau, p, I)$ to


$$\bigoplus_{j=1}^{m} D_1(\tau, p, I_j)/D_0(\tau, p, I_j)$$


by $\pi f = (\pi_1 f, \pi_2 f, \dots, \pi_m f)$ we see immediately that the null space of $\pi$ is
contained in $D_0(\tau, p, I)$ so that $D_1(\tau, p, I)/D_0(\tau, p, I)$ is isomorphic to the
quotient of $\pi D_1(\tau, p, I)$ by $\pi D_0(\tau, p, I)$, which must be finite dimensional.

The previous discussion has essentially consisted of proofs that a regular
operator possesses properties very similar to those of the usual case. In fact
they possess sufficiently similar properties to carry over unchanged the results
on extensions proved by Rota in (8).

However, further questions naturally arise. It would simplify the problem
immensely if any $\tau$-operator had the property of commuting with the
projections $P_j$ defined by $P_j f = \chi_j f$ where $\chi_j$ is the characteristic function of $I_j$.
Even if this is not true one might wish to examine the nature of the operator
$T_j$ on $L_p(I_j)$ arising from a $\tau$-operator T by $T_j P_j f = P_j T f$ for $f \in D(T)$. One
can hardly expect that $T_j$ will even be closed in general, for all the functions
f in its domain have $\tau_{n-1} f$ absolutely continuous on compact subintervals of
$I_j$ which contain the end point of $I_j$, which is interior to I. This condition is
not satisfied by all $f \in D_1(\tau, p, I_j)$ in general. For the particular case when T
is $T_1(\tau, p, I)$ one might hope that the closure of $T_{1j}(\tau, p, I)$ would be $T_1(\tau, p, I_j)$.


THEOREM 3.3. The second adjoint of $T_{1j}(\tau, p, I)$ is $T_1(\tau, p, I_j)$.


Proof. We shall assume that the terms of $\tau$ with coefficients identically
zero on $I_j$ have been omitted and thus that $p_0(x) \ne 0$ except at $x_{j-1}$ and
$x_j$. Thus, if g belongs to the domain of the adjoint of $T_{1j}(\tau, p, I)$ it has absolutely
continuous derivatives up to order n - 1 on any compact interval properly
contained in $I_j$. Thus for any $f \in D_1(\tau, p, I)$ and $C^\infty$-functions $\phi_1$ and $\phi_2$,
such that $\phi_1 + \phi_2 = 1$,


. 3 x 
¢1 = 1 for x 4 : ma : P 
and 
3 x 
¢, =0 for s>= + =i! 


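we have
$0 = \langle f, g \rangle_{I_j} = \langle \phi_1 f, g \rangle_{I_j} + \langle \phi_2 f, g \rangle_{I_j}$.


As $\phi_1 f$ and $\phi_2 f$ both belong to $D_1(\tau, p, I)$ we must also have
$0 = \langle \phi_1 f, g \rangle_{I_j} = \langle \phi_2 f, g \rangle_{I_j}$, and thus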













o 
II 


lim > (—1)***"s,_2[ 61 Xj1 + Of (x1 + )ID g(xj-1 + €) 


504+ kel 


lim DO (—1) ry af (xy-1 + )D "g(x, + ©), 


40+ kel 


and 


n 


0 = lim yw (—1)"**"+,_.[2(x, — e)f(x, — €)]D* "g(x, — ©) 


404+ kel 


lim . a (—1)” Othe af (x3 — e)D* "g(x, — «). 


50+ k=l 


To show that these conditions imply that g € Do(r*, g, J;) we must show 
that if these conditions hold for f € D,(r, p, J) they also hold for f € D, 
(r, p, I;). If xy. or x, is an endpoint of J this implication follows immediately 
at that endpoint. For, let f € D,(r, p, J;) and x,, = xo. Thus the function 
f which is ¢:f in J, and 0 outside J, belongs to D,(r, p, J) and 


lim > (1 raf (xp + eg (xya + ©) 


20+ kel 


= lim > (— 1) raf (205-1 + e)g(xp1 + €) = 90 


«40+ k=l 


The proof if x, = x,, is similar. Otherwise both x, and x,_, are finite. We shall 
treat only the condition at x,_, as the treatment of the other is similar. If 
“¢ I then 


k-1 . 

o. (it — =z ) 

Tn—«f (x) = Tn r+u) is )- pl me 
v= 0 ! 


+ * JG soneewee CLUE) aye (—1)"""qn_v(é 1 — i. ~ |p} ae 
Jol @— Dp! ™ a SS) G1 — jy 
which clearly exists for x = x,_,;. Thus we can use this expression for x° = x,_, 


whether f € D,(r, p, J) or D,(r, p, I;). Now if 


lim ¢D“g(x;1 + «-) #0 


€.404 


g can hardly belong to L,(J;) so 


lim > (1) raf (xy + €)D* "g(x j-1 + €) 


; n ; Sei j z 1 
= lim ye (—1) “Tn x} (Xj-1) + “IK? Xj-1 tT €, &)tyf (E) 


{ 
+ Ko" (x;-1 + «, £)f(E) dé ¢D’ g(Xj-1 T €). 













It is also easy to verify that g € L,(J,) implies that 


lim D*~*g (x5. + of (KY? (xs-1 + €, &) raf (E) + KP (xy + , &) 


e50+- 
f(é)|dé = O. 
Thus the condition reduces to 


0 = lim Zz. (17 raf (x51) D* *g (x; 1+). 


We must show that if this holds for all f € D,(r, p, J) it also holds for all f 
in D,(r, p, I;). This amounts to showing that the admissible values of the 
vector (rof(xj~1), rif(xy-1), .-., Ta-1f (x y-1)) are the same for f € D,(r, p, J) 
and for f € Di(r, p, J;). A change in this vector amounts to changing 


n-—1 —— \P ar 
tof (x) = > Tf (x41) = %e3) +f [Ky (x, &) taf (&) + Kz” (x, &)f(€) ]dé 


! 
val) Vv. 





by the addition of a polynomial $R(x)$ of degree $n - 1$. As this addition could
be modified by a $C^\infty$ function outside of $I_j$, the admissible class of polynomials
in both cases consists of those for which $R(x)/p_0(x)$ belongs to $L_p(I_j)$. Thus
the admissible values of the vector are the same in both cases and we have
succeeded in proving that the adjoint of $T_{1j}(\tau, p, I)$ is $T_0(\tau^*, q, I_j)$. This
completes the proof.


COROLLARY 3.3. If $p \ne \infty$ the closure of $T_{1j}(\tau, p, I)$ is $T_1(\tau, p, I_j)$.


For a more general r-operator 7 the domain D(T) is determined by a 
finite set of boundary conditions a, a2,..., a, Even if we assume that 
these are separated in the sense that each one depends only on values of a 
function near one point of %, it does not follow that any of them can be 
considered as boundary conditions on any $I_j$. This is due to the fact that a
boundary condition at a point $x_j$ of $B$ which is interior to $I$ will usually
depend on values of a function on both sides of $x_j$. If none of the boundary
conditions determining $D(T)$ can be considered as boundary conditions on
$I_j$ then it is clear that the second adjoint of $T_j$ will again be $T_1(\tau, p, I_j)$.


One can give a more definitive answer to the question of permutability with 
the projections P;. 


THEOREM 3.4. If $T_0(\tau, p, I) \subset T \subset T_1(\tau, p, I)$ then $P_j T = T P_j$ if and only
if $P_j D(T) = D_0(\tau, p, I_j)$ for $j = 2, 3, \ldots, m-1$ and $F(P_j f) = 0$ for any
$f \in D(T)$ and boundary condition $F$ on $I_1$ or $I_m$ which depends only on the
values of a function near $x_1$ or $x_{m-1}$ respectively.


Proof. If P; is to leave D(T) invariant it is clear that f € D(T) implies 
xuf € D(T) so mf(x;) =0 for k=0,1,...,m—1; fj =1,2,...,m—1. 
Noting that 7,(7*,q¢, J,) is the closure of r* on C*“*j(J,) (\ L,(J;) we see 
that for g € C**j(J,) OL,(1;), (f, g)1, depends only on values of f and g 







near $x_0$ or $x_m$ if either belongs to $I_j$, and is zero otherwise. Thus clearly
$\chi_j f \in D_0(\tau, p, I_j)$ for $j = 2, 3, \ldots, m-1$, and $F(P_j f) = 0$ for any boundary
condition $F$ on $I_1$ or $I_m$ which depends only on the values of a function near
$x_1$ or $x_{m-1}$ respectively. The converse implication is trivial.

Thus if $T$ is to commute with $P_j$ the boundary conditions must be chosen
with great care, and even this will be impossible unless $P_j D_0(\tau, p, I) = D_0(\tau, p, I_j)$
for $j = 2, 3, \ldots, m$.


4. Formally self-adjoint operators on $L_2(I)$. In dealing with the $L_2$
case we shall conform to the standard Hilbert space notation and $\tau^*$ will
now denote the conjugate of (1.2). As the $q_k$'s are now conjugates of those
in (1.2) the expressions for $\tau_k^* f$ in (1.3) must have conjugates on the $q_k$'s.
When we speak of adjoints of operators we now mean the usual Hilbert
space adjoint.

We shall assume henceforward that $\tau = \tau^*$. Thus $T_0(\tau, I) = T_0(\tau, 2, I)$ is
a symmetric operator and $T_0(\tau, I) \subset T_1(\tau, I) = T_0^*(\tau, I)$. As usual we
are interested in discussing the spectral resolution of operators $T$ such that
$T_0(\tau, I) \subset T \subset T_1(\tau, I)$, and particularly in those which are maximal
symmetric or self-adjoint. The general theory of such problems has been
considered by many authors, so we shall only consider a particular aspect.
Coddington (2, 3, 4, 5) has shown how the resolution of the identity of a
self-adjoint extension, or the generalized resolution of the identity for a maximal
symmetric extension, can be expressed as integral operators in such a way as
to yield an expansion theorem and Parseval equality, provided that the
resolvent, or generalized resolvent, is an integral operator of Carleman type.
This is also related to the work of Mautner (6), Bade and Schwartz (1), and
Nelson (7) on eigenfunction expansions.

We shall prove only the following theorem, which allows one to apply these 
results to obtain an expansion theorem. 


THEOREM 4.1. If $B$ is finite, $T_0(\tau, I) \subset T \subset T_1(\tau, I)$ where $T$ is maximal
symmetric or self-adjoint, then the generalized resolvent or resolvent of $T$ is an
integral operator of Carleman type.


Proof. On $I_j$, $\tau$ gives rise to a problem of the type considered by Coddington
(3, 4), who showed that there are either maximal symmetric extensions
possessing generalized resolvents, which are integral operators of Carleman
type, or self-adjoint extensions with resolvents having the same property. Let
$G_j(x, \xi, \lambda)$ be the kernel of such a resolvent or generalized resolvent.

Now the essential spectrum of $T_0(\tau, I)$, and thus of $T_1(\tau, I)$, is contained
in the real axis, so we can apply a result of Rota (9) to construct for $\operatorname{Im}\lambda > 0$
an orthonormal basis

$\phi_1^{(j)}(x, \lambda), \ldots, \phi_{\omega_j^+}^{(j)}(x, \lambda)$

of the null-space of $T_1(\tau, I_j) - \lambda$, and a similar basis

$\phi_1^{(j)}(x, \lambda), \ldots, \phi_{\omega_j^-}^{(j)}(x, \lambda)$

for $\operatorname{Im}\lambda < 0$. These bases will be analytic in $\lambda$.

Now if $g \in D_1(\tau, I)$ and $(T_1(\tau, I) - \lambda) g = f$ we see that on $I_j$, $g$ can differ
from

$\int_{I_j} G_j(x, \xi, \lambda) f(\xi)\, d\xi$

only by a linear combination of $\phi_k^{(j)}(x, \lambda)$ ($k = 1, 2, \ldots, \omega_j(\lambda)$; $\omega_j(\lambda) = \omega_j^+$
for $\operatorname{Im}\lambda > 0$ and $= \omega_j^-$ for $\operatorname{Im}\lambda < 0$). Thus

(4.1)  $g(x) = \sum_{k=1}^{\omega_j(\lambda)} a_{jk}\, \phi_k^{(j)}(x, \lambda) + \int_{I_j} G_j(x, \xi, \lambda) f(\xi)\, d\xi, \qquad x \in I_j.$
Since g € D,(r, I) we must also have 


. A) 
re | G(x; _s 0, g, r)f(E)dE + > A jet e ox” (x; _ 0, ) 
T k=1 


@j +1 (A) 


re Gyii(x, + 0, & A(E\dE + 2. O5+1k tebe (x, + 0, dA) 
j+i ke 


for 7 = 1,2,...,m—1; e =0,1,...,"—1; where 


Tee (x; — 0, dr) 


and 


tebe (xy + 0, d) 


clearly exist as the x,’s involved are finite. Similarly 7r,G,(x,; — 0, £, A) and 
7G 541(x, + 0, &, A) exist and are in L2(J;) and Le(J441) respectively as func- 
tions of £. These identities determine certain of the a,,’s in terms of certain 
others and in terms of such expressions as 


| Td4a;(x, — 0, &, A)f(E)dE. 
Tj 


Thus we may rewrite (4.1) in the form 


@(A) . 

(4.2) g(x) = > a, dy (x, A) + Jee, &, A)f(E)dE, 
where the a,’s are the a,’s which remained undetermined above, and 
G(x, —&, A) € Le(J) as function of £. Clearly the functions ¢,(x, A) belong to 
the null-space of 7;(r, 7), and we can assume that they form a basis for 
this space which is orthonormal and analytic in \X. 

Now if dmy\ > 0, w(A) = w+, and if dm <0, w(A) = w —. We may 
assume w + <w-— so that a maximal symmetric extension of 7 (7, J) is 
determined by an isometric V from N(7,(r, J) — 7) into N(T,(r, J) + 4) in 


the following manner 


D(T,) = tf € Dilr, Dif = fo + UI —V) fa. fo € Delt, Df. € RIMi7, I — vd} 
















then 
D(T,) = {f € Dilr, DIf = fo + (I — V*)f_, fo € Do(r, I), f 
MN(Ti(r, J) + 72)}. 


This clearly amounts to imposing w + boundary conditions, and allows us 
to determine from (4.2) a Carleman kernel G,(x, £, 4), which is a kernel for 
the generalized resolvent of the maximal symmetric operator 7,,. 


5. Examples. Here we shall discuss briefly three examples of operators 
which are formally self-adjoint. 
(a) If $\tau y = (xy')'$ on $I = [-1, 1]$ one finds that $B = \{-1, 0, 1\}$ and

$\langle f, g\rangle = a_1(g)a_2(f) - a_2(g)a_1(f) + a_3(g)a_4(f) - a_4(g)a_3(f) + a_5(g)a_6(f) - a_6(g)a_5(f),$

where

$a_1(f) = f(1)$, $a_2(f) = f'(1)$, $a_3(f) = \lim_{x \to 0} x f'(x) = \tau_1 f(0)$,

$a_4(f) = \lim_{x \to 0+} [f(x) - f(-x)]$, $a_5(f) = f(-1)$, and $a_6(f) = f'(-1)$.

Thus this operator of the second order requires three boundary conditions
to determine a self-adjoint extension on $L_2(I)$.
With the boundary conditions $a_1(f) = a_3(f) = a_5(f) = 0$ we obtain a
self-adjoint extension on $L_2(I)$ which commutes with the projections $P_1$ and $P_2$.
On the other hand, with the boundary conditions $a_1(f) = a_4(f) = a_5(f) = 0$
we obtain a self-adjoint extension which does not commute with $P_1$ and $P_2$.
We might also note that this differential expression has particularly simple
properties as the equation $(T_1(\tau, p, I) - \lambda) f = 0$ has three linearly independent
solutions, which are entire functions of $\lambda$ for any $p < \infty$. These are


u(x, A) = § Jo(2(—ax)}) x>0 
0 <Q 

: ) 0 x>0 
ua(x, d L Jo(2(—dx)') x<0 


9) 1 
u3(x, A) = ¥o(2(—Ax)) _ .- Jo(2(—dx)*) (log (—A)* + ¥y) 


2 ( 
= S log x| Jo(2(—dx)*) — => er 3 O(n), 
T 
where 
— | 
o(n) = >> -. 
tai R 


For p = © only u(x, A) and ue(x, A) belong to L.,(/). However, in this case 
as(J) = ag(J) = 0 automatically. 
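
As a worked check of the first two solutions listed above, the following sketch verifies that $J_0(2(-\lambda x)^{1/2})$ satisfies $(xu')' = \lambda u$ away from $x = 0$. The auxiliary variable $s$ is introduced here only for the computation and is not part of the original notation; the argument of $J_0$ is read from the displayed formulas as $2(-\lambda x)^{1/2}$, which is an assumption about the garbled display.

```latex
\begin{align*}
s &= 2(-\lambda x)^{1/2}, \qquad s^2 = -4\lambda x, \qquad \frac{ds}{dx} = -\frac{2\lambda}{s}, \\
u(x) &= J_0(s), \qquad
x\,u'(x) = \Bigl(-\frac{s^2}{4\lambda}\Bigr) J_0'(s) \Bigl(-\frac{2\lambda}{s}\Bigr)
         = \tfrac12\, s\, J_0'(s), \\
(x u')' &= \tfrac12 \bigl( J_0'(s) + s J_0''(s) \bigr) \frac{ds}{dx}
         = -\lambda \Bigl( J_0''(s) + \frac{1}{s} J_0'(s) \Bigr)
         = \lambda J_0(s) = \lambda u .
\end{align*}
```

The last step uses Bessel's equation of order zero, $J_0'' + s^{-1} J_0' + J_0 = 0$. The same computation applies on either side of $x = 0$, which is consistent with the two one-sided solutions being listed separately.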













(b) If ry = (x*y’)’ + 4y on J = [— 1, 1], one finds that B = {— 1,0, 1} 
again, but 


(f, 2.) = ar(g)ae(f) — a2(g)ar(f) — as(g)as(f) + aa(g)as(f), 


with the same notation as in (a). Here any extension defined by separated 

boundary conditions commutes with the projections P; and P», but extensions 

given by non-separated boundary conditions will not have this property of 

course. The spectrum is purely continuous (A > —}) for separated boundary 

conditions, but there may also be point spectrum in the general situation. 
(c) If 


(xy)! + dy’ x>0 
watt: . on J = [— 1,1] 
’ liy x <0 
one again finds that 8 = {— 1,0, 1} but here 


- I -__—— ——_—— — -=— ee 
(f,g) = (rfg — frg)dx = f’(1)g(1) — f(1)g’(1) + f(1)g(1) — if(—1)g(—1). 
J 1 


Thus 7 (7, J) has no self-adjoint extensions on L2(J), but a maximal sym- 
metric extension will commute with P; and Pz, if its domain is defined by 
separated boundary conditions. 


REFERENCES 


1. W. G. Bade and J. F. Schwartz, On Mautner's eigenfunction expansions, Proc. Nat. Acad. 
Sci. U.S.A., 42 (1956), 519-525. 

2. E. A. Coddington, The spectral matrix and Green's function for singular self-adjoint boundary 
value problems, Can. J. Math., 6 (1954), 169-185. 


3. ———— On self-adjoint ordinary differential operators, Math. Scand., 4 (1956), 9-21.
4. ———— On maximal symmetric ordinary differential operators, Math. Scand., 4 (1956), 22-28.
5. ———— Generalized resolutions of the identity for closed symmetric ordinary differential
operators, Proc. Nat. Acad. Sci. U.S.A., 42 (1956), 638-642.

6. F. I. Mautner, On eigenfunction expansions, Proc. Nat. Acad. Sci. U.S.A., 39 (1953), 49-53.

7. Edward Nelson, Kernel functions and eigenfunction expansions, Duke Math. J., 25 (1958),
15-28.

8. G. C. Rota, Extension theory of differential operators I, Comm. Pure and Applied Math.,
11 (1958), 23-65.

9. ———— On the spectra of singular boundary value problems, M.I.T. Note (1959).


Queen's University 




















PROPERTIES OF SOLUTIONS 
OF PARABOLIC EQUATIONS AND INEQUALITIES 


M. H. PROTTER 


1. Introduction. In this paper we shall be concerned with two prob- 
lems: (i) the asymptotic behavior of solutions of parabolic inequalities and 
(ii) the uniqueness of the Cauchy problem for such inequalities when the 
data are prescribed on a portion of a time-like surface. The unifying feature 
of these rather separate problems is the employment of integral estimates 
of the same type in both cases. 

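We consider parabolic operators in self-adjoint form

(1)  $L \equiv F - \dfrac{\partial}{\partial t} \equiv \sum_{i,j=1}^{n} \dfrac{\partial}{\partial x_i}\Bigl(a_{ij} \dfrac{\partial}{\partial x_j}\Bigr) - \dfrac{\partial}{\partial t},$

as well as the non-self-adjoint operator

(2)  $M \equiv G - \dfrac{\partial}{\partial t} \equiv \sum_{i,j=1}^{n} b_{ij} \dfrac{\partial^2}{\partial x_i\, \partial x_j} - \dfrac{\partial}{\partial t}; \qquad b_{ij} = b_{ji},$

where the coefficients $a_{ij}(x, t) = a_{ij}(x_1, x_2, \ldots, x_n, t)$ are $C^1$ functions of $x$
and $t$ and the $b_{ij} = b_{ij}(x, t)$ are $C^2$ functions of $x$ and $t$. The portions of the
operators

$F = \sum_{i,j=1}^{n} \dfrac{\partial}{\partial x_i}\Bigl(a_{ij} \dfrac{\partial}{\partial x_j}\Bigr), \qquad G = \sum_{i,j=1}^{n} b_{ij} \dfrac{\partial^2}{\partial x_i\, \partial x_j}$

are assumed to be uniformly elliptic throughout the domain of definition.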

To study asymptotic behavior we consider a bounded domain $D$ in
$n$-dimensional euclidean space $E_n$ with boundary $\Gamma$. Denote by $I(T)$ the interval
$0 < t < T$ and by $I$ the half-infinite interval $0 < t < \infty$. The $(n + 1)$-dimensional
product domain $D \times I$ will be designated by $R$ while $S$ will be the
portion of the boundary of $R$ consisting of $\Gamma \times I$.

We are interested in the growth of functions $u(x, t)$ which satisfy in $R$
differential inequalities of the form

(3)  $(Lu)^2 \le c_1(t)\, u^2 + c_2(t) \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2,$
Received February 24, 1960. This research was supported by the United States Air Force 
through the Air Force Office of Scientific Research of the Air Research and Development 
Command under Contract No. AF 49(638)-398. 















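(4)  $(Mu)^2 \le d_1(t)\, u^2 + d_2(t) \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2,$

or more generally the same inequality in integrated form

$\int_D (Lu)^2 \le c_1(t) \int_D u^2 + c_2(t) \int_D \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2,$

$\int_D (Mu)^2 \le d_1(t) \int_D u^2 + d_2(t) \int_D \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2.$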

The further condition

(5)  $u = 0$ on $S$

will be assumed throughout § 2. However the theorems of that section are
applicable without change to the condition

$\dfrac{\partial u}{\partial \nu} = 0$ on $S$

where $\partial/\partial\nu$ is the co-normal derivative defined in the customary manner. In
fact with suitable restrictions on $p(x, y)$, $q(x, y)$ the results apply with the
more general condition

$p(x, y)\, \dfrac{\partial u}{\partial \nu} + q(x, y)\, u = 0$ on $S$.

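We define the functions

$A_0(t) = \sup_{x \in D;\ i,j} \Bigl| \dfrac{\partial}{\partial t} a_{ij}(x, t) \Bigr|,$

$B_0(t) = \sup_{x \in D;\ i,j} \Bigl| \dfrac{\partial}{\partial t} b_{ij}(x, t) \Bigr|,$

$B_1(t) = \sup_{x \in D;\ i,j} \Bigl| \dfrac{\partial}{\partial x_i} (b_{ij}) \Bigr|, \qquad i, j = 1, 2, \ldots, n.$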

The starting point of the investigation of asymptotic behaviour is the
knowledge that solutions of the heat equation

$\dfrac{\partial u}{\partial t} = \Delta u$

which satisfy (5) decay as $e^{-\lambda t}$ for some positive $\lambda$ as $t \to \infty$. This result
was extended considerably by Lax (1), who showed that for abstract
non-positive operators $N$ defined in a Hilbert space, and for functions $u$ satisfying
(5) and an inequality of the form

$\| u_t - Nu \| \le C_1(t)\, \| u \|,$

the rate of decay is again as $e^{-\lambda t}$, provided certain auxiliary conditions on
the nature of the spectrum of $N$ and the function $C_1(t)$ are satisfied. Lees (3)
also investigated the asymptotic behaviour of solutions of differential inequality
(3) from the abstract point of view and his results apparently overlap with
those given in § 2.








We shall show that under certain conditions on the functions $A_0(t)$, $B_0(t)$,
$B_1(t)$, as well as on the functions $c_i(t)$, $d_i(t)$, $i = 1, 2$, solutions of either (3)
or (4) decay as $\exp(-\lambda t^{\eta})$ for some positive $\lambda$ and some $\eta > 1$ as $t \to \infty$. In
case $u(x, t)$ satisfies the differential equation rather than the inequality, that
is, if $c_i(t) = d_i(t) = 0$, $i = 1, 2$, then under natural hypotheses on the
coefficients the solutions decay as $\exp(-\lambda t)$ for some positive $\lambda$. The methods
employ $L_2$ estimates for functions with compact support in $t$ and kernels
depending on $t$, but which merely satisfy (5) as functions of $x$. The estimates
are in terms of parabolic operators (3), (4). These inequalities are a more or
less natural development of those given in (5), where the estimates are in
terms of elliptic operators, and the subsequent ones derived in (2), where
the estimates are in terms of parabolic operators; but the functions are
assumed to have compact support in $x$ and $t$.

In § 3 the problem of the uniqueness of the Cauchy problem for inequalities
(3) or (4) is solved when the data are prescribed on a piece of a time-like
surface. This question for parabolic equations was solved by Mizohata (4)
using the Calderón-Zygmund method of singular integrals. Here the main
tool consists of $L_2$ estimates (with a kernel depending on $x$) for functions
with compact support in $x$ and $t$ in terms of operators (3), (4).


2. Asymptotic behavior. Let $v(x, t) = v(x_1, x_2, \ldots, x_n, t)$ be a
function defined in $R$ and satisfying the conditions

(6)  $v = 0$ on $S$,
(7)  $v(x, t) = 0$ for $(x, t) \in D \times I(T_0)$

for some $T_0 > 0$. Further it is supposed that for fixed $\eta \ge 1$, for every positive
$\lambda$ and for all $\beta$ the integral

(8)  $\int_D t^{2\beta} e^{2\lambda t^{\eta}} \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \to 0$ as $t \to \infty$.

Functions $v$ which satisfy (6), (7), and (8) are said to belong to class $C(\eta)$.
We note that any function in $C(\eta)$ satisfies a fortiori the condition

$\lim_{t \to \infty} \int_D t^{2\beta} e^{2\lambda t^{\eta}} v^2 = 0.$

We define the function

$K = K(\beta, \lambda, \eta) = t^{2\beta} e^{2\lambda t^{\eta}}.$
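
The lemmas that follow rely on how the weight $K$ behaves under differentiation in $t$ and under a shift of $\beta$. As a worked check, assuming the reading $K(\beta, \lambda, \eta) = t^{2\beta} e^{2\lambda t^{\eta}}$ (the exponents are inferred from the limit condition stated just above and from the substitution used in the proof of Lemma 1; the shift parameter $\gamma$ is introduced here only for illustration), one finds:

```latex
\begin{align*}
K(\beta,\lambda,\eta) &= t^{2\beta} e^{2\lambda t^{\eta}}, \\
\frac{\partial K}{\partial t} &= \bigl( 2\beta t^{-1} + 2\lambda\eta\, t^{\eta-1} \bigr) K, \\
K(\tfrac12\beta, \tfrac12\lambda, \eta)^2 &= K(\beta,\lambda,\eta), \\
K(\beta+\gamma, \lambda, \eta) &= t^{2\gamma}\, K(\beta,\lambda,\eta).
\end{align*}
```

The second line is the source of the factor $\beta t^{-1} + \lambda\eta\, t^{\eta-1}$ that appears throughout the proofs, and the third explains why a square root of $K$ can be written as $K$ with halved parameters.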











Generally we shall employ the letter $m_0$ as a generic constant, depending
only on $n$ and the ellipticity constants in the operators $F$ and $G$.


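LEMMA 1. If $v \in C(\eta)$ we have the inequality

(9)  $\int_R \bigl[ \lambda\eta(\eta - 1) t^{\eta-2} - \beta t^{-2} \bigr] K(\beta, \lambda, \eta)\, v^2 \le \int_R K(\beta, \lambda, \eta) (Lv)^2 + m_0 \int_R A_0(t) K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

Proof. We define the function

$z = K(\tfrac12\beta, \tfrac12\lambda, \eta)\, v.$

Then $z$ also satisfies conditions (6), (7), and (8) and hence is in $C(\eta)$. We
have

$(v_t - Fv)^2 = K(-\beta, -\lambda, \eta) \{ Fz - [z_t - (\beta t^{-1} + \lambda\eta t^{\eta-1}) z] \}^2$

and

$K(\beta, \lambda, \eta) (Lv)^2 = [Fz - z_t + (\beta t^{-1} + \lambda\eta t^{\eta-1}) z]^2.$

From the elementary inequality

$(a + b + c)^2 \ge 2b(a + c)$

we obtain

$K (Lv)^2 \ge -2 z_t Fz - 2(\beta t^{-1} + \lambda\eta t^{\eta-1}) z z_t.$

Let $R(T)$ denote the domain $D \times I(T)$. Integrating this last inequality over
the domain $R(T)$ we have

$-2 \int_{R(T)} z_t \sum_{i,j=1}^{n} \dfrac{\partial}{\partial x_i}\Bigl(a_{ij} \dfrac{\partial z}{\partial x_j}\Bigr) - \int_{R(T)} (\beta t^{-1} + \lambda\eta t^{\eta-1}) (z^2)_t \le \int_{R(T)} K (Lv)^2.$

An integration by parts yields

(10)  $\int_{R(T)} \bigl[ \lambda\eta(\eta - 1) t^{\eta-2} - \beta t^{-2} \bigr] z^2 \le \int_{R(T)} K (Lv)^2 + \int_{R(T)} \sum_{i,j=1}^{n} \dfrac{\partial z}{\partial x_i} \dfrac{\partial z}{\partial x_j} \dfrac{\partial a_{ij}}{\partial t} + J$

where $J$ consists of integrals taken over the boundary of $R(T)$. All such
integrals vanish because of the boundary conditions except those taken over
the portion where $t = T$. Since $z \in C(\eta)$ these integrals tend to zero as
$T \to \infty$. Recalling the definition of $A_0(t)$ and noting that $K$ is independent
of $x$ we have

$\int_{R(T)} \sum_{i,j=1}^{n} \dfrac{\partial z}{\partial x_i} \dfrac{\partial z}{\partial x_j} \dfrac{\partial a_{ij}}{\partial t} \le m_0 \int_{R(T)} A_0(t) K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

Substituting this in (10), inserting $z$ in terms of $v$, and letting $T \to \infty$ we
obtain (9).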
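
Two steps of the proof of Lemma 1 are short enough to write out. The following is a sketch, assuming the weight $K = t^{2\beta} e^{2\lambda t^{\eta}}$ and the identification of $a$, $b$, $c$ given in the second line (our reading of the proof, not a quotation from it). Since $v$, and hence $z$, vanishes for $t \le T_0$, the $t$-integration starts at $T_0$ and only the boundary term at $t = T$ survives.

```latex
\begin{align*}
(a+b+c)^2 &= b^2 + 2b(a+c) + (a+c)^2 \;\ge\; 2b(a+c), \\
a &= Fz, \qquad b = -z_t, \qquad c = (\beta t^{-1} + \lambda\eta\, t^{\eta-1})\, z, \\
-\int_{T_0}^{T} (\beta t^{-1} + \lambda\eta\, t^{\eta-1})\, (z^2)_t \, dt
  &= \int_{T_0}^{T} \bigl[ \lambda\eta(\eta-1) t^{\eta-2} - \beta t^{-2} \bigr] z^2 \, dt
     \;-\; \bigl( \beta T^{-1} + \lambda\eta\, T^{\eta-1} \bigr)\, z^2 \Big|_{t=T} .
\end{align*}
```

The bracket in the last line is the coefficient on the left of (9) and (10); the boundary term at $t = T$ is one of the contributions collected in $J$.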


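LEMMA 2. If $v \in C(\eta)$ we have the inequality

(11)  $\int_R K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R t K (Lv)^2 + m_0 \int_R (\lambda\eta t^{\eta-1} + |\beta| t^{-1}) K v^2.$

Proof. We consider the identity

(12)  $-\int_{R(T)} K v Lv = \tfrac12 \int_{R(T)} (K v^2)_t - \beta \int_{R(T)} t^{-1} K v^2 - \lambda\eta \int_{R(T)} t^{\eta-1} K v^2 - \int_{R(T)} \sum_{i,j=1}^{n} \dfrac{\partial}{\partial x_i}\Bigl(K a_{ij} v \dfrac{\partial v}{\partial x_j}\Bigr) + \int_{R(T)} \sum_{i,j=1}^{n} K a_{ij} \dfrac{\partial v}{\partial x_i} \dfrac{\partial v}{\partial x_j}.$

The ellipticity of the operator $F$ asserts that there exist constants $a_0$, $a_1$
such that

$a_0 \sum_{i=1}^{n} \xi_i^2 \le \sum_{i,j=1}^{n} a_{ij} \xi_i \xi_j \le a_1 \sum_{i=1}^{n} \xi_i^2$

for all real $n$-dimensional vectors $(\xi_1, \xi_2, \ldots, \xi_n)$. The uniform ellipticity
simply means that $a_0$, $a_1$ are independent of $(x, t)$. Hence, integrating the
identity (12) by parts and employing the above inequality, we find

$\int_{R(T)} K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_{R(T)} K |v Lv| + m_0 |\beta| \int_{R(T)} t^{-1} K v^2 + m_0 \lambda\eta \int_{R(T)} t^{\eta-1} K v^2 + J.$

Again $J$ denotes surface integrals along $t = T$ which tend to zero as $T \to \infty$.
We apply Cauchy's inequality to the first term on the right and obtain
inequality (11) by letting $T$ tend to infinity.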
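
The application of Cauchy's inequality in the last step can be written out as follows. This is one way of splitting the product that reproduces the weights $tK$ and $t^{-1}K$ on the right of (11); the particular choice of $p$ and $q$ is our assumption, since the text does not display it.

```latex
\begin{align*}
2|pq| &\le p^2 + q^2, \qquad
p = t^{1/2} K^{1/2} |Lv|, \qquad q = t^{-1/2} K^{1/2} |v|, \\
K\, |v\, Lv| &= |pq| \;\le\; \tfrac12\, t K (Lv)^2 + \tfrac12\, t^{-1} K v^2 .
\end{align*}
```

The term $\tfrac12 t^{-1} K v^2$ is of the same form as the $|\beta| t^{-1} K v^2$ term already present, so it is absorbed into the last integral of (11) after adjusting the generic constant $m_0$.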

Similar inequalities are obtained with respect to the operator $M$.


LEMMA 3. If $v \in C(\eta)$ we have the inequality


, ‘ gi. . 
(13 | | ant — 1)" - alk, A, n)v 
“R 


. . » . av ' 
< | K(Mv)’ + mo} [Bo(t) + Bi|K > (—}). 
JR ~/R 1 Ox 
Proof. We define the function z as in Lemma 1 and obtain 


bu m ' . , 
(2 ~ Ge) = K(—8, —A, n) {Gz — 2, + (Bl ‘+ Anf et? 
Using the elementary inequality 


(a+b+c)? >? + 2b(a +c) 














we get 

K( Mov)? > — 22,G2 — 2(Bt-' + Ani! )zz, + 2,’. 
Hence integrating over R(7) we find after te tae by parts 


: z 2 “ dz dz 
| (2) +f Ocn(a — 1)e » \s +2 | — Ey) < 
R(T) R(T R(T) i 2X, = Ox, dt 


= Oz 9 
- > 26.) wef K(Mv)* + J, 
R(T) i, j=1 at Ox, Ox; ar) 


where J has its usual meaning. The last integral on the left is dominated by 


. : n ey 2 
Mo | By(t)K > () ° 
/ R(T) =1 \OX; 
We also have the inequality 
2| f de dz <if (s a)" 5 f ~ (2)' 
2 > my | By (t)K ——- J 
23 C4 ) ex, at | ot »s Ox | 
These inequalities combine to yield (13). 


LEMMA 4. If $v \in C(\eta)$ we have the inequality


(14) | K(8, X, 7) > (2) < < mo | tK (M>?) 
/R R 


i=—1 
+ mo | (Ant”* + |ele-* + B,(t))Ko’. 
“’R 


Proof. We consider the identity 

P 7 7 1 7 r_ 2 ‘ ly-_2 7 lz 2 
KvMov = (Kv'),— 8 t Kv — An J t" Ko 

(T) R(T) Y R(T 


. “ “ . n * “ 
0 . ov ‘ ov ov 

os | > - (xb, a2) 4 j 5 Kb, — — 
YR(T) i. Ox; Ox; / R(T 1 Ox, Ox ; 


J 


From the ellipticity condition we have 
‘ -, Ov dv " Te [av \’ 
mo > Kobi; e os > K > — ’ 
Y R(T) OX; OX / R(T) i=] 7 


and from Cauchy’s inequality 


vy : 3) a ; 4 2 
‘ K > ie “ (b;;)v az,| < sf K > (2 2) 4 Mo | KB,(t)v 
iv R(T) OX; R(T) r ~Y R(T) 


Hence, after an integration by parts, the above identity yields the inequality 


. ; n Iv 2 . ; . 
| K 7 (2) < | K|vMov| + mo | (Ant™* + |B\e-* + Bi(t))Ko’+J. 
R(T) ‘t=1 \OX; ~Y R(T) ~/ R(T) 


Inequality (14) is obtained by letting T— ~. 













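LEMMA 5. Let $v \in C(\eta)$, $\eta > 1$ and suppose $A_0(t) = O(t^{-1})$. Let $v \equiv 0$ in
$D \times I(T^*)$ where $T^*$ depends on $A_0(t)$. Then for sufficiently large $\lambda$ we have
the inequality

(15)  $\lambda \int_R t^{\eta-2} K(\beta, \lambda, \eta)\, v^2 + \int_R t^{-1} K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R K (Lv)^2.$

Proof. For sufficiently large $\lambda$ and for $\eta > 1$, the expression on
the left in (9) is bounded below by a constant multiple of

$\lambda \int_R t^{\eta-2} K v^2$

and we have the inequality

(16)  $\lambda \int_R t^{\eta-2} K(\beta, \lambda, \eta)\, v^2 \le m_0 \int_R K(\beta, \lambda, \eta) (Lv)^2 + m_0 \int_R A_0(t) K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

Replacing $\beta$ by $\beta + \tfrac12$ in (16) we get

(17)  $\lambda \int_R t^{\eta-1} K(\beta, \lambda, \eta)\, v^2 \le m_0 \int_R t K(\beta, \lambda, \eta) (Lv)^2 + m_0 \int_R t A_0(t) K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

Similarly substituting $\beta - \tfrac12(\eta - 1)$ for $\beta$ in (16) yields

(18)  $\lambda \int_R t^{-1} K v^2 \le m_0 \int_R t^{1-\eta} K (Lv)^2 + m_0 \int_R t^{1-\eta} A_0(t) K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

If (17) and (18) are inserted into the right side of (11) we find

(19)  $\int_R K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R t K(\beta, \lambda, \eta) (Lv)^2 + m_0 \int_R t A_0(t) K(\beta, \lambda, \eta) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2.$

We now replace $\beta$ by $\beta - \tfrac12$ in (19) and add the result to (16). This gives

$\lambda \int_R t^{\eta-2} K v^2 + \int_R [t^{-1} - 2 m_0 A_0(t)] K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R K (Lv)^2.$

Since by hypothesis we have $A_0(t) = O(t^{-1})$ we may select $T^*$ so large that
$1 - 2 m_0 t A_0(t) \ge \tfrac12$ for all $t \ge T^*$. With this choice of $T^*$ (15) follows at once.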
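
The substitutions for $\beta$ used in the proof of Lemma 5 all rest on one observation: shifting $\beta$ multiplies the weight $K$ by a power of $t$. A short table of the three shifts, assuming $K(\beta, \lambda, \eta) = t^{2\beta} e^{2\lambda t^{\eta}}$ as read from the definition in this section:

```latex
\begin{align*}
K(\beta + \tfrac12, \lambda, \eta) &= t\, K(\beta, \lambda, \eta),
  & t^{\eta-2} \cdot t &= t^{\eta-1}, \\
K(\beta - \tfrac12(\eta-1), \lambda, \eta) &= t^{1-\eta}\, K(\beta, \lambda, \eta),
  & t^{\eta-2} \cdot t^{1-\eta} &= t^{-1}, \\
K(\beta - \tfrac12, \lambda, \eta) &= t^{-1}\, K(\beta, \lambda, \eta) .
\end{align*}
```

The first two rows turn the weight $t^{\eta-2} K$ on the left of (16) into the weights $t^{\eta-1} K$ and $t^{-1} K$ that occur on the right of (11); the third row converts (19) into an estimate for $t^{-1} K$ times the squared gradient, which is what appears in (15).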


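THEOREM 1. Let $u(x, t)$ satisfy inequality (3):

$(Lu)^2 \le c_1(t)\, u^2 + c_2(t) \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2$

in $R$. Let $u$ vanish on $S$ and suppose condition (8):

$\lim_{t \to \infty} \int_D t^{2\beta} e^{2\lambda t^{\eta}} \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2 = 0$

holds* for some fixed $\eta > 1$. If $c_1(t) = O(t^{\eta-2})$, $c_2(t) = O(t^{-1})$, and $A_0(t) = O(t^{-1})$
then $u \equiv 0$ in $R$.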


Proof. We define $\zeta = \zeta(t)$ as a monotone increasing smooth function of $t$
so that

$\zeta = 0$ for $0 \le t \le T_1$, $\quad 0 \le \zeta \le 1$ for $T_1 \le t \le T_2$, $\quad \zeta = 1$ for $t \ge T_2$.

We select $T_1$ to satisfy two conditions. First, $T_1$ is selected larger than the
quantity $T^*$ determined in Lemma 5. Second, $T_1$ is increased, if necessary,
so that $m_0 t c_2(t) \le \tfrac12$ for $t \ge T_1$ where $m_0$ is the constant in the right side
of inequality (15). The function

$v(x, t) = \zeta(t) u(x, t)$

is in class $C(\eta)$ and inequality (15) is valid for $v$. We define $R(T_2 - T_1)$ to
be the domain $D \times (I(T_2) - I(T_1))$ and $R(T_2)$ the domain $D \times (I - I(T_2))$.
We have from (15) applied to $v$:

$\lambda \int_{R(T_2)} t^{\eta-2} K u^2 + \int_{R(T_2)} t^{-1} K \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2 \le m_0 \int_{R(T_2 - T_1)} K (Lv)^2 + m_0 \int_{R(T_2)} K (Lu)^2,$

since the left side is decreased by omission of the integrals taken over the
domain $R(T_2 - T_1)$. We substitute (3) into the last integral on the right
and get

$\int_{R(T_2)} [\lambda t^{\eta-2} - m_0 c_1(t)] K u^2 + \int_{R(T_2)} [t^{-1} - m_0 c_2(t)] K \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2 \le m_0 \int_{R(T_2 - T_1)} K (Lv)^2.$

Since $t^{2-\eta} c_1(t)$ is bounded we select $\lambda$ so large that the coefficient of $K u^2$ is
at least $\tfrac12 \lambda t^{\eta-2}$. Further the integrals on the left are decreased if the
range of integration is diminished to $R(T_3)$ for some $T_3 > T_2$. Hence

$\lambda \int_{R(T_3)} t^{\eta-2} K u^2 + \int_{R(T_3)} t^{-1} K \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2 \le 2 m_0 \int_{R(T_2 - T_1)} K (Lv)^2.$

From the definition of K, we obtain 


“ au \?* - : . 
7 vor y f u® +f > (=) | < 2mol ne f zo)’. 
R(T3) R(T) OX; * 


*If $c_2(t) \equiv 0$, the square of the gradient in (8) should be replaced by the square of the
function $u$.
















Letting $\lambda \to \infty$ we see at once that $u \equiv 0$ for $t \ge T_3$. Thus $u$ satisfies (3),
vanishes on $S$, and vanishes for $t \ge T_3$. Theorem 1 of Lees and Protter (2)
now applies, so we conclude that $u$ vanishes identically in $R$.
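
The passage to the limit $\lambda \to \infty$ in the proof of Theorem 1 can be made explicit as follows. This is a sketch under assumptions that the text does not state: $\beta \ge 0$, so that $K = t^{2\beta} e^{2\lambda t^{\eta}}$ is increasing in $t$, and the integral of $(Lv)^2$ over the fixed slab $T_1 \le t \le T_2$ is finite. The weights inside the bracket follow our reading of the preceding inequality, and $K(T)$ is shorthand for the value of $K$ at $t = T$.

```latex
\begin{align*}
K(T_3) \Bigl[ \lambda \int_{D \times \{t \ge T_3\}} t^{\eta-2} u^2
   + \int_{D \times \{t \ge T_3\}} t^{-1} \sum_{i=1}^{n}
     \Bigl( \frac{\partial u}{\partial x_i} \Bigr)^2 \Bigr]
 &\le 2 m_0\, K(T_2) \int_{D \times \{T_1 \le t \le T_2\}} (Lv)^2, \\
\frac{K(T_2)}{K(T_3)}
 &= \Bigl( \frac{T_2}{T_3} \Bigr)^{2\beta}
    e^{-2\lambda (T_3^{\eta} - T_2^{\eta})} \;\longrightarrow\; 0
    \qquad (\lambda \to \infty).
\end{align*}
```

Since $v = \zeta u$ and $m_0$ do not depend on $\lambda$, dividing the first line by $K(T_3)$ and letting $\lambda \to \infty$ forces both integrals on the left to vanish, which gives $u \equiv 0$ for $t \ge T_3$.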

To prove the theorem corresponding to Theorem 1 for operators which are 
not self-adjoint we first establish the inequality analogous to (15). 


LEMMA 6. Let $v \in C(\eta)$, $\eta > 1$ and suppose $B_0(t) = o(t^{-1})$, $B_1(t) = o(t^{-1})$.
Let $v \equiv 0$ in $D \times I(T^*)$ where $T^*$ depends on $B_0$, $B_1$. Then for sufficiently
large $\lambda$ we have the inequality

(20)  $\lambda \int_R t^{\eta-2} K(\beta, \lambda, \eta)\, v^2 + \int_R t^{-1} K \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R K (Mv)^2.$

Proof. The establishment of (20) follows from Lemmas 3 and 4 in the
same way that (15) was obtained from Lemmas 1 and 2. With the aid of
Lemma 6 the proof of the following result parallels the proof of Theorem 1.


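THEOREM 2. Let $u(x, t)$ satisfy inequality (4):

$(Mu)^2 \le d_1(t)\, u^2 + d_2(t) \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2$

in $R$. Let $u$ vanish on $S$ and suppose condition (8):

$\lim_{t \to \infty} \int_D t^{2\beta} e^{2\lambda t^{\eta}} \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2 = 0$

holds for some fixed $\eta > 1$. If $d_1(t) = O(t^{\eta-2})$, $d_2(t) = O(t^{-1})$, $B_0(t) = o(t^{-1})$,
$B_1(t) = o(t^{-1})$ then $u \equiv 0$ in $R$.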

The basic inequalities of Lemmas 1 and 2 vary slightly for the case $\eta = 1$,
that is, for solutions which decay as $e^{-\lambda t}$ for some positive $\lambda$. For this purpose
we state the following inequalities.


LEMMA 7. If $v \in C(1)$ we have the inequality

$-\beta \int_R t^{-2} K(\beta, \lambda, 1)\, v^2 \le \int_R K(\beta, \lambda, 1) (Lv)^2 + m_0 \int_R A_0(t) K(\beta, \lambda, 1) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2$

valid for all $\beta$. This is obtained directly from Lemma 1 by setting $\eta = 1$. For
convenience we write $K(\beta, \lambda)$ for $K(\beta, \lambda, 1)$.


LEMMA 8. If $v \in C(1)$ we have the inequality

$\int_R K(\beta, \lambda) \sum_{i=1}^{n} \Bigl(\dfrac{\partial v}{\partial x_i}\Bigr)^2 \le m_0 \int_R t K (Lv)^2 + m_0 \int_R (\lambda + |\beta| t^{-1}) K v^2.$

This follows from Lemma 2 by setting $\eta = 1$. Combining Lemmas 7 and
8 we get:


LemMA 9. Let v € C(1) and suppose Ao(t) = o(t-*). Letv = 0 in D X I(T*) 
where T* depends on Ao(t). Then for sufficiently large } and — 8 we have the 
inequality 










J r*K(B, xo" + | KD (2) < 3m J K (le). 
, . Ox; B K 


i=1 


This lemma is a consequence of Lemmas 7 and 8 in the same manner that
Lemma 5 is derived from Lemmas 1 and 2. We thus obtain:


THEOREM 3. Let u(x,t) satisfy inequality (3) in R. Let u vanish on S and 
suppose condition (8) holds for n = 1. If c,(t), c(t), and Ao(t) are all o(t-*) 
ast— @ then u =0 in R. A similar result holds for inequality (4) pertaining 
to operators not in self-adjoint form. 


The results of this section are easily extended to operators of the form 


a n é 
M,= = de bul!) a5 + e(x, t) 


and the corresponding differential inequality

$(M_1 u)^2 \le d_1(t)\, u^2 + d_2(t) \sum_{i=1}^{n} \Bigl(\dfrac{\partial u}{\partial x_i}\Bigr)^2.$


If the function e(x, ¢) is bounded and satisfies the condition 


~=0("), to©@ 


then Theorem 2 is valid for operators M, with the proof unchanged. Similarly 
Theorem 1 holds for operators L; containing a zero order term. In particular 
if e is independent of ¢ the above condition is automatically satisfied and 
merely boundedness suffices. 


3. Cauchy problem with data on a time-like surface. In this section
we shall be concerned with the uniqueness of the Cauchy problem for the
general inequality (4) with data given on a piece of time-like surface. In
other words, we shall suppose that on a portion of the boundary surface $S$,
say $S_0$, we prescribe

$u = \dfrac{\partial u}{\partial n} = 0$

where $\partial/\partial n$ is the derivative taken in a direction normal to $S$. From this
we shall conclude that $u$ vanishes in the subregion of $R$ contained in the
strip $T_1 \le t \le T_2$, where $T_1$ is the minimum value of $t$ in $S_0$ and $T_2$ is the
maximum value of $t$ in $S_0$. The extension to the case where $S_0$ is any
time-like surface is easily made. For this purpose we need two lemmas similar
to ones established in (2). We introduce Euclidean distance $r$ in $E_n$, that is,

$r = \Bigl( \sum_{i=1}^{n} x_i^2 \Bigr)^{1/2}.$
















LemMA 10. Let u € C* vanish outside the cylindrical domain R(T): ro <r < 11, 
0 <t<T. Then for r, sufficiently small and for all sufficiently large B we 
have 


(21) p* peer"? . mre J F aie > (2) < my f P**( Mu)’. 
/ R(T) R(T i=l Ox; Y R(T 
Proof. We select r; so small that 
b4s(0, t) = 45 + By 
where 
bis] < mons. 


This can always be done by a change of independent variable. As before, we 


define 
and consider the expression 


We have 


dz @ --0 


+2 27-8 2 re r-8 
(22) rte?" (Gu — u,)* = pt | Ge + 2e > (e ) 


4 ax, Ox, 
, : 2 
+ ze’ “G(e" *) — 2, |. 
We note that 
ax, 
and use the elementary inequality

$(a + b + c - d)^2 \ge (b - d)^2 + 2(b - d)(a + c).$


Interpreting 


a = Gz, 
8 dz O 3 
b = 2e’ b -——(¢" ), 
2. "? Ox, OX; } 
— = 3 
c=ze' Ge ), 
d=2, 


we get from (22) 
p+2 2r-8 2 2 9 2,9 + Oh 9 9 
eo (Mu) > b° — 2bd + d° + 2ab + 2bc — ad — 2cd. 


We now integrate throughout (x, f) space. Each integral which contains },, 
is further decomposed into integrals with 6,,, the principal part and ,,°, the 
residual part. Thus, for example, the principal part of 2ad is 













= dz 
26 f Zz *:3,,% 


and this integral is non-negative. The residual part leads to an integral of 


the form 
o-6 <&. lan \ 
B mor fe ym (=) ° 
i=1 OX; 


The integrals 6? — 2bd + d* — 2ad yield a positive definite quadratic form 
for sufficiently large 8. The principal part of the integral 2cd vanishes. The 
integral 2bc yields the term 
. 
a‘ | y7 28-2 ,27-F 2 


These combine to give (21). 
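
The elementary inequality used in the proof of Lemma 10, and the seven-term expression obtained from it, can be checked by grouping $a + c$ and $b - d$. The following is a worked expansion only; it does not depend on the particular meaning of $a$, $b$, $c$, $d$ in the proof.

```latex
\begin{align*}
(a+b+c-d)^2 &= \bigl[ (a+c) + (b-d) \bigr]^2
             = (a+c)^2 + 2(a+c)(b-d) + (b-d)^2 \\
            &\ge (b-d)^2 + 2(b-d)(a+c) \\
            &= b^2 - 2bd + d^2 + 2ab + 2bc - 2ad - 2cd .
\end{align*}
```

The discarded term $(a+c)^2$ is non-negative, which is all the inequality uses. The seven terms on the last line are the ones that the proof integrates and then splits into principal and residual parts.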


LeMMA 11. Under the hypotheses of Lemma 10 we have 


. a 2 . 
oe 4 ——2 2r-B 2 27-8 Ou 842 27-8 : 
(23) 6 J; ec u + Bm. | eo 2 (= <m |} re” (Mu). 
OX) e 
Proof. For functions u with compact support and an arbitrary C? function, 
a(x), independent of t, we have the identity 


Fane. — Gu) = — fauGu 


’ Ou du Af | : 
= fa 2», by @x, Ox, 2 u'| Gut+ > a 





TF 9 04 Oba ] 
OX OX; OX, OX; 


Since G is uniformly elliptic, when we select 
27-8 
a=e 
we get 


9 . : 
27-8 ou \ 2r-8, 2 —28-2 27-B 2 
=. (=) < mo j ee” \u Mu| + moB | r ec a 
e e 


OX; 


We apply Cauchy’s inequality to the first term on the right and obtain 


. “ 2 . . 
2r-8 Ou B+2 27-8 2 2 —2—2 2r-F 2 
fe me» (=) < mo? | re (Mu)” + mo8 J r es 
Ox ; 


We multiply this inequality by 8 and add to (21). For 8 sufficiently large 
and 1; sufficiently small we deduce (23). 


THEOREM 4. Let $u$ satisfy inequality (4) in a region $R(T)$ and suppose that
on a portion $S_0$ of the boundary $S$ the condition

$u = \dfrac{\partial u}{\partial n} = 0$

holds. Then $u \equiv 0$ in the subregion of $R(T)$: $T_1 \le t \le T_2$ where $T_1$ is the
minimum and $T_2$ the maximum value of $t$ in $S_0$.













Proof. We select the origin of our co-ordinate system outside of $R + S$
but so close to a point of $S_0$ that the distance $r_0$ of Lemma 10 is exterior
to $R + S$ while the distance $r_1$ is interior to $R + S$.

We define the functions ¢,(r), ¢2(¢) so that 


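$$\zeta_1(r) = \begin{cases} 1, & 0 \le r \le r_1, \\ 0 \le \zeta_1 \le 1, & r_1 \le r \le r_2, \\ 0, & r \ge r_2, \end{cases}$$

and

$$\zeta_2(t) = \begin{cases} 0, & t \ge T_3, \\ 0 \le \zeta_2 \le 1, & T_4 \le t \le T_3, \\ 1, & T_5 \le t \le T_4, \\ 0 \le \zeta_2 \le 1, & T_6 \le t \le T_5, \\ 0, & t \le T_6, \end{cases}$$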


where $r_1 < r_2$ and $T_3 > T_4 > T_5 > T_6$, and the functions $\zeta_1$, $\zeta_2$ are in $C^2$. In general
we denote by $E(r_i, T_j, T_k)$ the region $0 < r < r_i$, $T_j \le t \le T_k$. We now
define the function

$$v = \zeta_1(r)\zeta_2(t)u.$$

Then $v$ satisfies the conditions of Lemmas 10 and 11, so that (23) applied to
$v$ yields
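$$\beta^4 \int_{E(r_1,T_6,T_3)} r^{-2\beta-2} e^{2r^{-\beta}} v^2 + \beta m_0 \int_{E(r_1,T_6,T_3)} e^{2r^{-\beta}} \sum_i \left(\frac{\partial v}{\partial x_i}\right)^2 \le m \int_{E(r_2,T_6,T_3)} r^{\beta+2} e^{2r^{-\beta}} (Mv)^2.$$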
Taking into account the fact that $\zeta_1$ and $\zeta_2$ are identically 1 in certain ranges
of the variables we have


2 
a4 —2—2 27-8 2 2r-8 ou 
B r eu? + mo | e" > (2 
© B(r1,.7s,74 


B(ri,7s,.T«) Ox ; 


. 


4 —%—2 2r-F, 2 
+, | r e” Sou 
e 


B2(r1,74,73)+2(11,76,T 


2 -8 Ou 
+ amy f ev” fs > (2) 
B(r1,74,73)+E(11 76,75) OX ; 


< mo f Pre (Mea) + maf pPtte2r-? (My)? 
B(ri.7¢.Ts) 


E(re.7¢,T3)—EB(ri 76,73) 


We note that $(M\zeta_2 u)^2 = (\zeta_2'(t)u + \zeta_2 (Mu))^2 \le 2\zeta_2'^2 u^2 + 2\zeta_2^2 (Mu)^2$. Hence
the first term on the right-hand side is dominated by


6+2 27-8 2 
mo f re” (Mu) 
~ B(r1.75,74) 


6+2 27-8. 42 2 ast 2 
+ ame f re” [ts 2¢2(Mu)’}. 
B(r17473)+2(11,76,75) 




In these integrals we replace $(Mu)^2$ by larger quantities as given by (4) to
obtain


; ere 27-8 du \° 
f ty 28. ‘ec’ u+Bmoe ; ( ) 
e 


B2(r1,.75,T4) OX ; 


° 2 
4 —28-2 2r-B. 2 27-8, Ou 
+{ Br e Cott o B mo e€ : ¢2 :2 (#.) 
EB(r1,74,73)+E(n1,76,7s Ox; 
. 3 1 2 
B+2 2r- 2 ou 
< mo Pre lu + > (2 
E(r1,75,74) OX; 
. a 2 
‘ B+2 2r-F] .,2 2 as%..2 ine u 
+ 2m r e r | A oa 2% ol + 2f2 y (=) | 
B(r1,74,73)+E(11,T6,Ts) Ox; 


. 
B+2 2 2 
+ mo | re" (Mo). 
E(re.7¢,73)—E(1r1,.76,T3) 


For $\beta$ sufficiently large the first integral on the left dominates the first integral
on the right and the second integral on the left dominates the second integral
on the right. Thus we find


J i 2,37? 2 + Bmyo er > (.) 


E(r1,75,T4) Ox ; 


my 
~ 
= 


2 27-8 2 
< mo | ‘di é (M>v) . 
BE(r2,.T¢,.T3)—E(r1 76,73) 


We now select $r_3 < r_1$ but sufficiently large so that the cylinder of radius $r_3$,
axis along $x = 0$, intersects $R$. Then the above inequality is strengthened
if the domain of integration on the left is reduced to $E(r_3, T_5, T_4)$. The above
inequality may now be replaced by


. + 2 
4. —28—2 273-8 2 2r3-8 Ou 
i oo J u- + Bmoe™ J p (2: 
B(r3,75,T4) E(r3,75,T4) Ox; 


. 
< mht | (Mv)° 
B(r2,.7¢,.73)—EB(r1,76,7T3) 
where $\bar r$ is the minimum value of $r$ in $E(r_2, T_6, T_3) - E(r_1, T_6, T_3)$. We note
that from the manner in which the domains were determined the quantity
$\bar r$ is larger than $r_3$. Now letting $\beta \to \infty$ we easily conclude that $u = 0$ in
$E(r_3, T_5, T_4)$. Proceeding step by step we conclude that $u = 0$ for $T_1 < t < T_2$
and the proof is complete.

From the method of proof it is clear that the extension to zero Cauchy
data given on a piece of an arbitrary time-like surface is immediate.










REFERENCES 


1. P. D. Lax, A stability theorem for solutions of abstract differential equations, and its application to the study of the local behavior of elliptic equations, Comm. Pure and Applied Math., 9 (1956), 747-766.

2. M. Lees and M. H. Protter, Unique continuation for parabolic differential equations and inequalities, to appear.

3. M. Lees, Asymptotic behavior of solutions of parabolic differential inequalities, to appear.

4. S. Mizohata, Unicité du prolongement des solutions pour quelques opérateurs differentiels paraboliques, Mem. Coll. Sci. Univ. Kyoto, Ser. A1, 37 (1958), 219-239.

5. M. H. Protter, Unique continuation for elliptic equations, Trans. Amer. Math. Soc., 95 (1960), 81-91.


University of California, Berkeley








GRAPH THEORY AND PROBABILITY. II 
P. ERDOS 


Define $f(k, l)$ as the least integer so that every graph having $f(k, l)$ vertices
contains either a complete graph of order $k$ or a set of $l$ independent vertices
(a complete graph of order $k$ is a graph of $k$ vertices every two of which are
connected by an edge; a set of $l$ vertices is called independent if no two are
connected by an edge).
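The definition can be checked by exhaustive search for very small parameters. The following is a minimal sketch, not part of the paper: the function names and the search limit `max_n` are illustrative assumptions, and the enumeration of all graphs is only feasible for about six vertices.

```python
from itertools import combinations

def has_clique_or_independent_set(n, edges, k, l):
    """True if the graph on range(n) contains a complete graph of order k
    or a set of l independent (pairwise non-adjacent) vertices."""
    def adjacent(a, b):
        return (a, b) in edges  # pairs are stored with a < b

    for group in combinations(range(n), k):
        if all(adjacent(a, b) for a, b in combinations(group, 2)):
            return True
    for group in combinations(range(n), l):
        if not any(adjacent(a, b) for a, b in combinations(group, 2)):
            return True
    return False

def f(k, l, max_n=6):
    """Least n such that every graph on n vertices has a complete graph of
    order k or l independent vertices; searched only up to max_n
    (an illustrative limit), returning None if no such n is found."""
    for n in range(1, max_n + 1):
        pairs = list(combinations(range(n), 2))
        every_graph_ok = True
        for mask in range(1 << len(pairs)):  # each bit mask is one graph
            edges = {p for i, p in enumerate(pairs) if mask >> i & 1}
            if not has_clique_or_independent_set(n, edges, k, l):
                every_graph_ok = False
                break
        if every_graph_ok:
            return n
    return None
```

For example, `f(3, 3)` searches all graphs on up to six vertices; the pentagon shows that five vertices are not enough.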


Throughout this paper $c_1, c_2, \ldots$ will denote positive absolute constants. It
is known (1, 2) that

$$l^{1+c_1} < f(3, l) \le \binom{l+1}{2}, \tag{1}$$

and in a previous paper (3) I stated that I can prove that for every $\epsilon > 0$
and $l > l(\epsilon)$, $f(3, l) > l^{2-\epsilon}$. In the present paper I am going to prove that

$$f(3, l) > \frac{c_2 l^2}{(\log l)^2}. \tag{2}$$

The proof of $f(3, l) > l^{1+c_1}$ was by an explicit construction. I can only
prove (2) by a probabilistic argument, and I cannot explicitly construct a
graph which satisfies it. The method used in the proof of (2) will be a combination
of that used in (3) with that in my recent paper (4) with Rényi. It
is possible that (2) can be strengthened to $f(3, l) > c\,l^2$, but it seems impossible
to improve (2) by the methods of this paper.


THEOREM. Let $A$ be a fixed, sufficiently large number. Then for every $n > n_0$
there is a graph $G$ having $n$ vertices, which contains no triangle and which does
not contain a set of $[A n^{1/2} \log n] = x$ independent vertices.


Clearly our theorem implies (2). 
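The implication can be spelled out as follows. This is a sketch of the routine computation, not taken from the paper; it assumes $n$ is large enough that $x = [A n^{1/2} \log n] \ge n^{1/2}$.

```latex
% Assumption: x = [A n^{1/2} \log n] with n > n_0, as in the Theorem.
\begin{align*}
f(3, x) &> n
  && \text{($G$ has $n$ vertices, no triangle, no $x$ independent vertices)} \\
x \le A n^{1/2}\log n
  &\;\Longrightarrow\; n \ge \frac{x^2}{A^2 (\log n)^2}, \\
x \ge n^{1/2}
  &\;\Longrightarrow\; \log n \le 2\log x, \\
f(3, x) &> \frac{x^2}{4A^2 (\log x)^2}.
\end{align*}
```

Since $f(3, l)$ is non-decreasing in $l$ and consecutive values of $x$ have ratio tending to 1, the same bound with a slightly smaller constant should hold for every large $l$, which is the form stated in (2).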
To prove the theorem put $y = [n^{3/2}/A^{1/2}]$. Denote by $G^{(n)}$ the complete
graph of $n$ vertices and by $G^{(x)}$ any of its complete subgraphs having $x$ vertices.
Clearly we can choose $G^{(x)}$ in $\binom{n}{x}$ ways. Let

$$G_\alpha^{(n)}, \qquad 1 \le \alpha \le \binom{\binom{n}{2}}{y} = t, \tag{3}$$

be an arbitrary subgraph of $G^{(n)}$ having $y$ edges (we use the notations of
(3)). Now we need


Received June 24, 1960. 


LEMMA 1. Almost all $G_\alpha^{(n)}$ have the property that for every $G^{(x)}$ there is an
edge $e_{\alpha,x}$ contained in both $G_\alpha^{(n)}$ and $G^{(x)}$, which is not contained in any triangle
whose edges are in $G_\alpha^{(n)}$ and whose third vertex is not in $G^{(x)}$.

"Almost all" here means for all but $o(t)$ graphs $G_\alpha^{(n)}$. We could prove
Lemma 1 even if we would omit the words "and whose third vertex is not
in $G^{(x)}$," but the proof would become very much more complicated, and
Lemma 1 suffices for the proof of our theorem.

The proof of Lemma 1 will be difficult and we postpone it. Assume that
the lemma has already been proved; then it is easy to prove our theorem.
Let $G_\alpha^{(n)}$ be one of the graphs which satisfy Lemma 1. We construct a subgraph
$G_\alpha'^{(n)}$ as follows: Let $e_1^{(\alpha)}, e_2^{(\alpha)}, \ldots, e_y^{(\alpha)}$ be an arbitrary enumeration
of the edges of $G_\alpha^{(n)}$. We put $e_1^{(\alpha)} \subset G_\alpha'^{(n)}$, and we have $e_k^{(\alpha)} \subset G_\alpha'^{(n)}$ $(1 < k \le y)$
if and only if $e_k^{(\alpha)}$ does not form a triangle with the edges $e_r^{(\alpha)}$, $1 \le r < k$,
which we had already put in $G_\alpha'^{(n)}$. The graph $G_\alpha'^{(n)}$ has $n$ vertices, contains no triangle,
and does not contain a set of $x$ independent vertices. The first two statements
are obvious; now we prove the third one. It will suffice to show that for
every $G^{(x)}$, $G^{(x)} \cap G_\alpha'^{(n)}$ is not empty. Consider the edge $e_{\alpha,x} = e_r$ (see Lemma 1).
If it is contained in $G_\alpha'^{(n)}$ our statement is proved; if not, there must exist a
triangle $e_i, e_j, e_r$ $(i < r, j < r)$ whose edges are all in $G_\alpha^{(n)}$, with $e_i$ and $e_j$ already in $G_\alpha'^{(n)}$. But by Lemma 1
the third vertex of this triangle must also be in $G^{(x)}$; thus $e_i \subset G^{(x)}$, $e_j \subset G^{(x)}$,
or $e_i$ and $e_j$ are both in $G^{(x)} \cap G_\alpha'^{(n)}$. This completes the proof of our third
statement, and thus if we put $G = G_\alpha'^{(n)}$ the proof of our theorem is complete.
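The greedy deletion step described above is easy to state as code. The sketch below is illustrative only: the names `greedy_triangle_free` and `has_triangle` are my own, and vertices are assumed to be numbered `0, ..., n-1`.

```python
from itertools import combinations

def greedy_triangle_free(n, edges):
    """Scan `edges` in the given order and keep an edge unless it would
    close a triangle with two edges that were kept earlier."""
    neighbours = [set() for _ in range(n)]
    kept = []
    for a, b in edges:
        if neighbours[a] & neighbours[b]:
            continue  # a common kept neighbour would complete a triangle
        neighbours[a].add(b)
        neighbours[b].add(a)
        kept.append((a, b))
    return kept

def has_triangle(n, edges):
    """Brute-force check used only to confirm the result."""
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(p) in present for p in combinations(t, 2))
        for t in combinations(range(n), 3)
    )

# Example: the complete graph on 4 vertices, edges in lexicographic order.
k4_edges = list(combinations(range(4), 2))
kept_edges = greedy_triangle_free(4, k4_edges)
```

On the complete graph with four vertices and this edge order, only the three edges at vertex 0 survive. The result depends on the enumeration, which is why the paper allows an arbitrary one and relies on Lemma 1 for the independent-set bound.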

If we had proved Lemma 1 in the stronger form without the words "and
whose third vertex is not in $G^{(x)}$," we could have defined $G_\alpha'^{(n)}$ as the union
of those edges of $G_\alpha^{(n)}$ which are not contained in any triangle of $G_\alpha^{(n)}$.

To complete our proof we now have to prove Lemma 1. First we need
some lemmas. Denote by $E_\alpha(G^{(x)})$ the number of edges in $G_\alpha^{(n)}$ connecting
the vertices in $G^{(x)}$ with the vertices not in $G^{(x)}$.

LEMMA 2. For almost all $G_\alpha^{(n)}$ we have

$$\max E_\alpha(G^{(x)}) < [n^{4/3}] = m, \tag{4}$$

where the maximum is taken over all the $\binom{n}{x}$ possible choices of $G^{(x)}$.


We could easily prove the lemma with (1 + 0(1))2A4m, but (4) will suffice 
for our purpose. 
The number $N(m)$ of $\alpha$'s for which (4) is not satisfied is not greater than

$$N(m) < \binom{n}{x}\binom{x(n-x)}{m}\binom{\binom{n}{2}-m}{y-m} < \binom{n}{x}\binom{nx}{m}\binom{\binom{n}{2}-m}{y-m}. \tag{5}$$

To prove (5) observe that there are $\binom{n}{x}$ choices for $G^{(x)}$, and the number
of edges in $G^{(n)}$ connecting the vertices of $G^{(x)}$ with those not in $G^{(x)}$ is
$x(n - x)$. Thus (5) follows by a simple combinatorial argument.
















In estimating binomial coefficients we will make use of the following simple
inequalities

$$\binom{u}{v} < \frac{u^v}{v!} < \left(\frac{eu}{v}\right)^v, \tag{6}$$


and 
(") 
9 a 
(8) ~S = sa = I > , for n > 3. 
n 2n 3 


From (5), (6), (7), and (8) we have (by substituting the values of $x$, $y$, and $m$)

$$N(m)/t < n^x \left(\frac{enx}{m}\right)^m \left(\frac{3y}{n^2}\right)^m < n^x \left(\frac{10xy}{nm}\right)^m = o(1),$$

which proves the lemma.
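The binomial estimate (6), read here as $\binom{u}{v} < u^v/v! < (eu/v)^v$, can be spot-checked numerically. The sketch below is an illustration under that reading, and it restricts to $2 \le v \le u$ because the first inequality becomes an equality at $v = 1$; the function name is my own.

```python
import math

def inequality_6_holds(u, v):
    """Check binom(u, v) < u^v / v! < (e*u/v)^v for integers 2 <= v <= u."""
    # First inequality, cleared of denominators so it is exact in integers.
    first = math.comb(u, v) * math.factorial(v) < u ** v
    # Second inequality: after cancelling u^v it reads v^v < e^v * v!.
    second = v ** v < math.exp(v) * math.factorial(v)
    return first and second

checked = all(
    inequality_6_holds(u, v)
    for u in range(2, 40)
    for v in range(2, u + 1)
)
```

The second comparison does not depend on `u`; it is the elementary bound $v^v/v! < e^v$, which follows from the series for $e^v$.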


LEMMA 3. For almost all $G_\alpha^{(n)}$ the degree of every vertex of $G_\alpha^{(n)}$ is less than

$$\left[10\left(\frac{n}{A}\right)^{1/2}\right] = p.$$

By a theorem of Rényi and myself (4) it follows that $p$ can be replaced by

$$(1 + o(1))\,2\left(\frac{n}{A}\right)^{1/2},$$

but the weaker result will suffice here.
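A quick simulation illustrates how much room Lemma 3 leaves. The sketch below is not from the paper: it draws one random graph with $n$ vertices and $y = [n^{3/2}/A^{1/2}]$ edges, using the illustrative values $n = 400$ and $A = 4$ (the lemma itself needs $A$ sufficiently large and $n \to \infty$), and compares the largest degree with $p = [10(n/A)^{1/2}]$.

```python
import random
from itertools import combinations

def max_degree_of_random_graph(n, y, seed=0):
    """Largest degree in a graph with y edges drawn uniformly at random
    (without repetition) from all pairs of range(n)."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))
    degree = [0] * n
    for a, b in rng.sample(all_pairs, y):
        degree[a] += 1
        degree[b] += 1
    return max(degree)

n, A = 400, 4.0                   # illustrative values, not from the paper
y = int(n ** 1.5 / A ** 0.5)      # y = [n^{3/2} / A^{1/2}]
p = int(10 * (n / A) ** 0.5)      # p = [10 (n/A)^{1/2}]
largest = max_degree_of_random_graph(n, y)
```

The average degree is $2y/n$, about $2(n/A)^{1/2}$, so the threshold $p$ is roughly five times the average; a single sample is of course only an illustration, not a proof.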
The number of $\alpha$'s for which the condition of Lemma 3 is not satisfied is,
by a simple combinatorial argument, less than

$$n\binom{n-1}{p}\binom{\binom{n}{2}-p}{y-p} < n\binom{n}{p}\binom{\binom{n}{2}-p}{y-p}$$

(since the number of $G_\alpha^{(n)}$ for which a given vertex has degree $\ge p$ is at most

$$\binom{n-1}{p}\binom{\binom{n}{2}-p}{y-p}$$

and there are $n$ possible choices for this vertex). From (6), (7), and (8), we
have
n ; m 
n(”) (") rf A < a( 3 ») < 
MN y=? ” 


which proves the lemma. 
















Put 
(9) , = (2'A?’ log nl], i m@: Ossi 
and 
; l 
lw, = | 7; fs| for0 <i < -logn 
(10) } 4° 1) 4 


Ww AB for £1 <4 
|W, = oO og n 1. 
‘ 4% 4°” 


We shall say that $G_\alpha^{(n)}$ has property $P_i$ if there exists a $G^{(x)}$ and an $i \ge 0$ so
that there are at least $w_i$ vertices not contained in $G^{(x)}$, each of which is
connected in $G_\alpha^{(n)}$ with at least $z_i$ vertices of $G^{(x)}$.


LEMMA 4. The number of graphs $G_\alpha^{(n)}$ which have property $P_i$ for some $i$
is $o(t)$.


n 


Since by Lemma 3 we can assume that the degree of every vertex of G, 
is less than p, we can assume that for sufficiently large A 


. } } 


Thus there are less than $\log n$ choices of $i$, and it will suffice to show that
for every $i$ satisfying (11) the number of $\alpha$'s for which $G_\alpha^{(n)}$ satisfies $P_i$ is
$o(t/\log n)$. Denote by $N_i$ the number of $\alpha$'s for which $G_\alpha^{(n)}$ satisfies $P_i$. A
simple combinatorial argument shows that


$$N_i < \binom{n}{x}\binom{n-x}{w_i}\binom{x}{z_i}^{w_i}\binom{\binom{n}{2}-w_i z_i}{y-w_i z_i}. \tag{12}$$

To see (12) observe that there are $\binom{n}{x}$ ways of choosing $G^{(x)}$; $\binom{n-x}{w_i}$ ways
of choosing the $w_i$ vertices not in $G^{(x)}$ which are connected with at least $z_i$
vertices of $G^{(x)}$; and $\binom{x}{z_i}^{w_i}$ ways of choosing the vertices in $G^{(x)}$ with which the
$w_i$ vertices not in $G^{(x)}$ are connected in $G_\alpha^{(n)}$. For the remaining $y - w_i z_i$
edges of $G_\alpha^{(n)}$ there are clearly

$$\binom{\binom{n}{2}-w_i z_i}{y-w_i z_i}$$

choices; thus (12) is proved. From (12), (6), (7), and (8) we have, by


Ww < Ai n? log n, 
} wi n P ws 24 
1 MN, n n\{x ( ) - Wie; r+wi( BeXY 
7 < 2 t<n 5 
x/ \w 4 zm 
(13) y—we 


1 - 
z+v;( 10A? log n\"** 
n - 










Now 2** > n since 2; > [A*/* log m]. Thus 2”*** > n”‘, hence from (13), by 
substituting z, = [2‘A?/* log 2], we have for sufficiently large A 


N -( 304! \*** of 1 \"* 
(14) ts < w( 24) < w'(shs) , 


Assume first 0 < i < } log nm. Then from (9) and (10) we have 





ail } 
%G4+i1)°”° 


From (14) and (15) we have (exp u = e”) 


(16) * <n exp(—n'log 2) = o(2) : 


(15) W i > 





n 


Assume next i > } log m. From (9), (10), and (11) we have, by i < logn 
for sufficiently large A, 


A 2/3 | as 
(17) W 24 > er ad a > A*yh log n. 





Thus from (14) and (17), by 2‘! > n!/19, 


¥ 3 9 9 l 
(18) 7 < n* exp(—A*7n} (log n) /10) = o(4) 
for sufficiently large A. Equations (16) and (18) complete the proof of 
Lemma 4. 


LEMMA 5. Almost all $G_\alpha^{(n)}$ have the property that for every $G^{(x)}$ there are more
than $\frac{1}{2}\binom{x}{2}$ edges of $G^{(x)}$ which do not occur in any triangle, the other two sides
of which are in $G_\alpha^{(n)}$ and whose third vertex is not in $G^{(x)}$.


We could prove Lemma 5 even if we omit the words "and whose third
vertex is not in $G^{(x)}$," but the proof would be more complicated and Lemma 5
in its present form suffices for our purpose.

Denote by $u_1^{(\alpha)}, u_2^{(\alpha)}, \ldots, u_{n-x}^{(\alpha)}$ the number of edges in $G_\alpha^{(n)}$ which connect
the $n - x$ vertices of $G^{(n)}$ not in $G^{(x)}$ with the vertices of $G^{(x)}$. The
number of edges of $G^{(x)}$ which are contained in triangles the other two sides
of which are in $G_\alpha^{(n)}$ and whose third vertex is not in $G^{(x)}$ is clearly at most

$$\sum_{j=1}^{n-x} \binom{u_j^{(\alpha)}}{2}.$$

Thus to prove Lemma 5 it will suffice to show that for almost all $\alpha$ we have
for every choice of $G^{(x)}$

$$\sum_{j=1}^{n-x} \binom{u_j^{(\alpha)}}{2} < \frac{1}{2}\binom{x}{2}. \tag{19}$$










By Lemma 4 we can assume that $G_\alpha^{(n)}$ does not satisfy $P_i$ for all $i \ge 0$. But
then the number of indices $j$ for which $u_j \ge z_i$ is not greater than $w_i$ for
all $i \ge 0$, or by (9) and (10) and $w_0 = n$


n—z (a) 2444/3 2 
(20) yo (w )< > w (5+) < > n2"A™ (log n) 4 


= 1 4°(i + 1)? 








> n2”*'A*" (log n)* 

2 4°i . 

where in 3,0. <i < flog; and in D> ,jlogn <i <logn 
1 9 


by (11). Thus, finally, from (20), 


n—z (a) 2 
> (¥ ) < = A*n(log n)* + 4A“"n(log n)* < ; (:) 


for sufficiently large A, and this proves the lemma. 

Now we can prove Lemma 1. It suffices to consider those $G_\alpha^{(n)}$ which
satisfy Lemmas 2 and 5 (since the number of the other graphs is $o(t)$). Let
$G^{(x)}$ be a fixed graph having $x$ vertices. We are going to estimate the number
of graphs $G_\alpha^{(n)}$ which satisfy Lemmas 2 and 4 and which fail to satisfy Lemma 1
with respect to $G^{(x)}$ (that is, which do not contain an edge $e_{\alpha,x} \subset G_\alpha^{(n)} \cap G^{(x)}$,
where $e_{\alpha,x}$ is not contained in any triangle whose other two sides are in $G_\alpha^{(n)}$
and whose third vertex is not in $G^{(x)}$). Let us assume that we have already
chosen the $u$ edges $e_1^{(\alpha)}, e_2^{(\alpha)}, \ldots, e_u^{(\alpha)}$ $(u = u_\alpha)$ which connect (in $G_\alpha^{(n)}$) the
vertices of $G^{(x)}$ with the vertices not in $G^{(x)}$. Since Lemma 2 holds we have
$u < n^{4/3}$. The number of the $G_\alpha^{(n)}$ for which $e_1^{(\alpha)}, e_2^{(\alpha)}, \ldots, e_u^{(\alpha)}$ are all the
edges which connect the vertices of $G^{(x)}$ with those not in $G^{(x)}$ clearly equals

$$\binom{\binom{n}{2} - x(n-x)}{y-u} = N(e_1^{(\alpha)}, \ldots, e_u^{(\alpha)}), \tag{21}$$

since we have at our disposal $\binom{n}{2} - x(n - x)$ edges and have to choose
$y - u$ of them. But by Lemma 5 there are at least $\frac{1}{2}\binom{x}{2}$ edges of $G^{(x)}$ which
do not form a triangle with any two of the $e_i$'s, $1 \le i \le u$, and if we put
any of these edges in $G_\alpha^{(n)}$ Lemma 1 will be satisfied. Hence the number
$N'(e_1^{(\alpha)}, \ldots, e_u^{(\alpha)})$ of graphs which do not satisfy Lemma 1 with respect to
$G^{(x)}$ and for which the edges connecting the vertices of $G^{(x)}$ with those not
in $G^{(x)}$ are $e_1^{(\alpha)}, \ldots, e_u^{(\alpha)}$, satisfies ($u < n^{4/3} < y/2$ for $n > n_0(A)$)

$$N'(e_1^{(\alpha)}, \ldots, e_u^{(\alpha)}) \le \binom{\binom{n}{2} - x(n-x) - \frac{1}{2}\binom{x}{2}}{y-u}. \tag{22}$$







Thus from (21), (22), and (7), we have 


(" ' ; 1 {x 
N'(e”,... <| 2 “a ee 
‘ 


(23) = 
x” \9" x“"y 
_ on? < exp “i ‘ 





Riey”,... 


Since (23) holds for all choices of $e_1^{(\alpha)}, \ldots, e_u^{(\alpha)}$ which satisfy Lemmas 2
and 4, we obtain that the number of $G_\alpha^{(n)}$ which satisfy Lemmas 2 and 4
but do not satisfy Lemma 1 with respect to $G^{(x)}$ is less than


(24) t exp( - =z) : 


Since there are $\binom{n}{x}$ choices for $G^{(x)}$ we obtain from (24) and Lemmas 2
and 4 that the number of graphs $G_\alpha^{(n)}$ which do not satisfy Lemma 1 is less


n 
than (( ) < "’) 
- 


x 


Yy —w y aw _s» 
-=3) + o(t) < texp(x log nm) exp( 1) + o(t) 


$$= t\exp\bigl((1 + o(1))A n^{1/2} (\log n)^2\bigr) \exp\bigl[-(1 + o(1))A^{3/2} n^{1/2} (\log n)^2/4\bigr] + o(t) = o(t),$$

which completes the proof of Lemma 1. Thus our theorem is proved.

The difficulty of trying to improve our theorem by the methods used in
this paper is due to my belief that there exists a constant $c_3 = c_3(A)$ so
that almost all graphs $G_\alpha^{(n)}$ contain an independent set of $[c_3 n^{1/2} \log n]$ vertices.
I am unable at present to prove or disprove this conjecture.


REFERENCES 


1. P. Erdős and G. Szekeres, On a combinatorial problem in geometry, Compositio Math., 2 (1935), 463-470.

2. P. Erdős, Remarks on a theorem of Ramsey, Bull. Research Council of Israel, Section F, 7 (1957).

3. P. Erdős, Graph theory and probability, Can. J. Math., 11 (1959), 34-38.

4. P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., 5 (1960), 17-61.


Australian National University, Canberra 




























